Abstract 

Observations of the high-redshift Universe with the 21 cm hyperfine line of neutral 
hydrogen promise to open an entirely new window onto the early phases of cosmic 
structure formation. Here we review the physics of the 21 cm transition, focusing 
on processes relevant at high redshifts, and describe the insights to be gained from 
such observations. These include measuring the matter power spectrum at z ~ 50, 
observing the formation of the cosmic web and the first luminous sources, and map- 
ping the reionization of the intergalactic medium. The epoch of reionization is of 
particular interest, because large HII regions will seed substantial fluctuations in the 
21 cm background. We also discuss the experimental challenges involved in detecting 
this signal, with an emphasis on the Galactic and extragalactic foregrounds. These 
increase rapidly toward low frequencies and are especially severe for the highest red- 
shift applications. Assuming that these difficulties can be overcome, the redshifted 
21 cm line will offer unique insight into the high-redshift Universe, complement- 
ing other probes but providing the only direct, three-dimensional view of structure 
formation from z ~ 200 to z ~ 6. 

1 Introduction 



1.1 From the Dark Ages to Reionization 

Perhaps the most compelling story in all of astrophysics is the formation of 
structure in our Universe: how the exceedingly complex objects that surround 
us today grew out of the remarkably simple and smooth medium that emerged 
from the Big Bang. Recent decades have seen enormous progress in disentan- 
gling many of the threads in this story, and the basic paradigm for structure 
formation is now in place. In the resulting picture, the tiny (~ 10^-5) density 
fluctuations that we observe in the cosmic microwave background (CMB) grew 
through gravitational instability until they collapsed into the "cosmic web" of 
sheets, filaments, and halos that we see around us today. 

This overarching scheme has been extraordinarily successful in explaining ob- 
servations of both the early Universe and local structures. But two significant 
gaps remain in the story: we have yet to directly observe the cosmic "dark 
ages" between the last scattering surface of the CMB and the formation of 
the first luminous structures, and we are only just now beginning to explore 
the era of "first light" that stretches from the formation of these sources to 
the full reionization of the intergalactic medium (IGM). These epochs - be- 
tween z ~ 1000 and z ~ 6 - constitute the next frontier of observational 
cosmology, where we can directly study the transition between the linear and 
nonlinear regimes of gravitational growth, the characteristics of the first stars 
and quasars, and their influence on the Universe around them. They offer an 
opportunity to connect our detailed pictures of the early Universe with the 
galaxies around us, thus completing the narrative of structure formation. 

The physics content of the dark ages 2 is sufficiently simple that observa- 
tions of this period could directly constrain cosmology in an analogous way 
to the CMB. After the hydrogen gas recombined, only a few basic processes 
contributed to the Universe's evolution: its expansion, the recombination of 
electrons and protons, the interaction between CMB photons and the resid- 
ual electrons, and gravity. Furthermore, except at the conclusion of this era 
(z < 50), baryonic perturbations remained linear throughout the Universe. 
Thus, given a set of underlying cosmological parameters, we can straightfor- 
wardly compute the distribution of structure and its characteristics at any 
time during the dark ages. As with the CMB, measurements of this era would 
therefore help to constrain the basic global parameters of our Universe, such 
as the baryon and matter densities, the shape of the underlying matter power 
spectrum, and the amplitude of the initial fluctuations. Conversely, any 
observed deviation from the expected evolution would be a "clean" signature of 
fundamentally new physics. The dark ages even have one key advantage over 
the CMB: because the IGM 3 is no longer affected by photon diffusion, the 
baryons develop fluctuations on scales down to the Jeans mass in the neutral 
IGM. In principle, this permits tests of the matter power spectrum on much 
smaller scales than the CMB does [2]. Unfortunately, with present technology 
we have no way to access this treasure trove of information. 

2 The name first became part of astrophysical parlance through W. Sargent [1]. 

Astrophysics became important - and even dominant - during the next phase 
of cosmic history, which stretches from the formation of the first luminous 
sources (most likely at z > 30) to "reionization," when stars and quasars ion- 
ized hydrogen in the IGM. This era obviously holds a great deal of interest for 
cosmologists. Most important is the simple desire to study the first generation 
of stars and galaxies. But it could also answer any number of fundamental 
astrophysical questions. Although hierarchical structure formation - in which 
small dark matter halos and galaxies collapse first, later merging into larger 
objects - most likely describes this era as well as it does the z < 5 Universe, 
it leaves many crucial questions unanswered. Through which processes did 
the first stars form? How massive were they, and do fossils remain in the local 
Universe? When did heavy elements first form, and what processes distributed 
them throughout galaxies and the IGM? Was the IGM clumpy at these early 
epochs, or did it remain smooth until later on? How did feedback regulate the 
formation of galaxies, and what types - radiative, mechanical, or chemical - 
were most important? How and when did the first supermassive black holes 
form, and what role did they play in galaxy formation? 

A particularly fascinating set of questions relates to the epoch of reionization, 
the hallmark event of the high-redshift Universe. It is the point at which 
structure formation directly affected every baryon in the IGM, even though 
only a small fraction of them actually resided in galaxies. It also marked an 
important phase transition for galaxies: once the IGM was ionized, it became 
transparent to ultraviolet (UV) photons - a dawn (of sorts) for the young 
galaxies inhabiting the high-redshift Universe. 

Precisely because its astrophysics is so rich, this phase is much easier to explore 
than the dark ages - although, even so, only in the last few years has it finally 
become accessible. So far, nearly all of the observational attention has focused 
on understanding reionization itself (see [3] for a recent review). Figure 1 
summarizes all the existing direct measurements of the IGM ionization state 
at z > 5. The most straightforward come from quasar absorption spectra: 
as at lower redshifts, the Lyα forest offers a powerful window into the IGM 
and specifically the ionized fraction x_i (or the neutral fraction x_HI; we will 
use overbars to denote global averages of these quantities). With its large sky 
coverage, the Sloan Digital Sky Survey (SDSS) has proven particularly useful 
in extending this test to high redshifts and has (to date) identified 19 bright 
quasars at z > 5.74 [4-7]. Unfortunately, inferring the neutral fraction from 
these measurements requires a theoretical model, because only the rarest voids 
in the IGM allow light to pass through; thus only the tail of the IGM density 
distribution is directly sampled, and its properties must be extrapolated to the 
bulk of the matter [8,9]. Inserting a model based on simulations of the 
lower-redshift Lyα forest [10] (shown by the solid triangles) implies that the neutral 
fraction evolved rapidly at z > 5 [7, 11]; because we expect reionization to 
proceed rapidly, this may indicate that it ended at about this time. However, 
other models are consistent with a much more gentle evolution; for example, 
extrapolation from empirical fits to Lyα forest measurements at 1.7 < z < 5.6 
(shown by the crosses) requires no break in the smooth evolution at z > 6 [12] 
(see also [8, 13]). 

3 Of course, the "intergalactic medium" is a misnomer at these times, but we will 
nevertheless use it without reservation. 

Fig. 1. Compilation of direct observational constraints on reionization (see also [3]). 
The Lyα forest measurements with errors and the proximity zone point in the lower 
panel are taken from [7,12], the other Lyα forest points from [12], the proximity zone 
point in the upper panel from [14,15], the Lyα galaxy constraint from [16-18], and 
the GRB point from [19]. The shaded box shows the 1σ errors on the reionization 
redshift from the 3-year WMAP data [20], assuming that it is instantaneous. Note 
that these are approximate limits (at best) and depend upon a number of theoretical 
assumptions (see text). 

However, because the Lyα forest probes specific lines of sight and resolves 
features radially, it contains much more information than just x_i(z). One example 
is the "proximity zone," which is the region of the IGM directly influenced 
by the ionizing radiation from the background quasar itself (and thus more 
highly ionized than average). Before reionization is complete, the extent of 
the proximity zone measures the number of photons consumed in ionizing the 
initially neutral gas (and hence x_i) - at least if the HII region has not yet 
reached the Strömgren limit [21,22]. This test has been used in three different 
ways. First, the sizes clearly decrease with increasing redshift [7], which 
requires an increasing neutral fraction from z ~ 5.8 to z ~ 6.3 (shown by 
the lower open square on Figure 1; note that the errors on this extrapolation 
are hard to quantify and were not reported by [7]). Second, the inferred 
spatial extent of the region with non-zero transmissivity appears to indicate 
x_HI > 0.2 [15,23]. Unfortunately, this constraint is degenerate with the quasar 
lifetime and luminosity and is subject to uncertainties in defining the "edge" 
of the proximity zone in an inhomogeneous IGM [7]. Fortunately, the shape of 
the edge can help disentangle these uncertainties. The third constraint comes 
from such a measurement: one quasar (at z = 6.28) shows a significant decline 
in the Lyα flux well before Lyβ absorption becomes strong. This suggests a 
substantial damping wing, which requires x_HI > 0.2 [14]. 

The shape of the IGM Lyα damping wing depends on the total optical depth, 
so it provides another sensitive test of the neutral fraction [24]. However, in 
practice, measuring the detailed line profile is difficult. For example, the 
intrinsic Lyα lines of quasars compromise its measurement in those systems. 
Gamma-ray bursts (GRBs), which result from the deaths of massive stars, 
may provide better sources. Their afterglow spectra are (intrinsically) fea- 
tureless power laws (see [25] for a review), providing simple templates for 
extracting the damping profile [26]. Given that they are associated with star 
formation, GRBs also ought to occur at high redshifts (see [27] for one es- 
timate), and cosmological time dilation allows the most distant bursts to be 
identified with realistic instruments [28,29]. Indeed, a GRB was recently 
identified at z = 6.3 [30]. Unfortunately, it now appears that most GRB spectra 
contain damped Lyα absorbers (DLAs) at the systemic redshifts of their hosts, 
as well as a rich set of other absorption features (e.g., [31,32]), most likely be- 
cause the GRBs are embedded in rapidly star-forming regions. Such DLAs 
also compromise attempts to measure the IGM absorption. However, it is still 
possible to constrain the IGM damping wing, because it has a different shape 
than prototypical DLAs. The z = 6.3 GRB has no evidence for IGM absorption, 
setting an upper limit of x_HI < 0.6 (at 95% confidence) along this line of 
sight [19]. 

The sightline-to-sightline variations toward different quasars provide another 
approach. The observations show surprisingly large variations on ~ 100 Mpc 
scales [7, 33] . Some authors have argued that these require large-scale fluctu- 
ations in the ionizing background, indicative of the IGM during or soon after 
reionization [7, 34]. However, the statistics of the Lyα forest at low 
transmission are extremely hard to interpret because of aliasing and bias, so the 
evidence is not yet conclusive [35,36]. 

We can also learn about reionization through galaxy surveys. In particular, the 
Lyα emission lines of galaxies must vanish as the IGM becomes more and more 
neutral, because the enormous Lyα optical depth of a neutral IGM eliminates 
even those photons on the red side of the line [37-39]. In practice this test 
is difficult to make quantitative because of uncertainties in the intrinsic Lyα 
line properties of the galaxies (and in particular winds) [40]. But surveys for 
Lyα-selected galaxies have proven useful in two ways. First, reasonably large 
samples exist at both z ~ 5.7 and z ~ 6.5, bracketing the reionization epoch 
suggested by SDSS quasars. The narrow range in cosmic time between these 
two epochs implies little intrinsic evolution between the two samples, so any 
observed differences between the populations may be due to evolution in x_i. 
However, the observed samples have no statistically significant difference in 
number density [16]. Unfortunately, turning this into a quantitative constraint 
on x_HI is difficult because of questions about the intrinsic galaxy population. 
Treating each galaxy in isolation implies x_HI < 0.3 [16,41]; including clustering 
weakens the constraint to x_HI < 0.5 [17]. A complementary second constraint 
comes from requiring each observed z ~ 6.6 galaxy to be surrounded by a 
sufficiently large HII region to allow transmission. This again implies x_HI < 0.5 
[18], although it is also subject to uncertainties about clustering and the bubble 
size required for transmission. 

Finally, reionization affects the CMB. Thomson scattering of CMB photons 
by free electrons washes out temperature fluctuations and generates 
large-scale polarization anisotropies [42]. The first-year measurements of the 
Wilkinson Microwave Anisotropy Probe (WMAP) detected a substantial large-scale 
correlation between their temperature and polarization maps, indicating a 
large optical depth for this scattering process (τ_es ~ 0.17) and 
correspondingly early reionization [43,44]. However, the three-year data now available 
yield a much smaller optical depth, τ_es = 0.088 (+…, -0.034) from the WMAP data 
alone or τ_es = 0.069 (+…, -0.029) when combined with a suite of other cosmological 
measurements [20,45]. With better data, the temperature-polarization 
cross-correlation is no longer detected and the constraints rely instead on the actual 
polarization power spectrum, which is only detected at about the 3σ level. 
This illustrates the difficulty of removing strong foregrounds - a lesson which 
should be noted for our later discussion. Nevertheless, at least when taken at 
face value these lower measurements still imply a prolonged reionization epoch. 
In Figure 1, the shaded box shows the WMAP constraints on the reionization 
redshift, assuming an instantaneous transition. Of course this is an approxi- 
mation: because the optical depth provides only an integral measurement, it 
is difficult to apply it directly to a plot of x_i(z). 
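
To make the integral nature of this constraint concrete, the following minimal sketch computes the Thomson optical depth for an instantaneous reionization at redshift z_reion. It is an illustration under stated assumptions, not the WMAP analysis: hydrogen is taken to be fully ionized below z_reion, helium and residual electrons are ignored, the Universe is flat, and the cosmological parameters and present-day hydrogen density are those quoted in §1.3. The function names and the physical constants are our own choices.

```python
import math

# Assumed physical constants (CGS units); illustrative values.
C_CM_S = 2.998e10             # speed of light [cm/s]
SIGMA_T = 6.652e-25           # Thomson cross section [cm^2]
MPC_CM = 3.0857e24            # one megaparsec [cm]

# Cosmology quoted in Section 1.3 of this review.
H0 = 74.0 * 1.0e5 / MPC_CM    # Hubble constant [1/s] for h = 0.74
OMEGA_M, OMEGA_L = 0.26, 0.74
N_H0 = 2.06e-7                # hydrogen density at z = 0 [cm^-3]


def hubble(z):
    """H(z) in 1/s for a flat matter + Lambda universe."""
    return H0 * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def tau_es_numeric(z_reion, steps=2000):
    """Trapezoid-rule integral of c * sigma_T * n_e dt from z = 0 to z_reion.

    Assumes one free electron per hydrogen atom below z_reion (no helium).
    """
    dz = z_reion / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * dz
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (1.0 + z) ** 2 / hubble(z)
    return C_CM_S * SIGMA_T * N_H0 * total * dz


def tau_es_analytic(z_reion):
    """Closed form of the same integral (valid because OMEGA_M + OMEGA_L = 1)."""
    prefactor = 2.0 * C_CM_S * SIGMA_T * N_H0 / (3.0 * H0 * OMEGA_M)
    return prefactor * (math.sqrt(OMEGA_M * (1.0 + z_reion) ** 3 + OMEGA_L) - 1.0)


if __name__ == "__main__":
    for z_reion in (6.0, 8.0, 11.0, 15.0):
        print(f"z_reion = {z_reion:4.1f}  tau_es = {tau_es_analytic(z_reion):.3f}")
```

Because only this integral is measured, an extended ionization history and a sharp one can give the same optical depth, which is why a single "reionization redshift" should be read with caution.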

Although these observations are beginning to constrain reionization, countless 
questions remain. Most importantly, what sources drove it? The standard 
assumption is that ionizing photons from massive, hot stars inside small galaxies 
leaked into the IGM, producing HII regions that grew with the galaxies until 
they merged and filled the Universe. Although theoretically attractive, direct 
evidence for such a process is still lacking. Could accreting black holes have 
played a part? Did metal-free Population III stars contribute significantly? 
Did recombinations in the IGM significantly delay reionization? Did reioniza- 
tion itself suppress subsequent structure formation? The range of unanswered 
questions is illustrated by the proliferation of theoretical papers over the past 
several years; we refer the interested reader to [46-49] for detailed reviews of 
these issues. 

Unfortunately, existing techniques all have intrinsic drawbacks that ultimately 
limit their utility for studying structure formation at high redshifts (much less 
the dark ages). CMB polarization primarily provides an integrated measure 
of the column depth of ionized gas; information about the reionization history 
(much less about spatial fluctuations at any given time) is difficult to extract 
(see §11.3). The patchiness of reionization also imprints small-scale tempera- 
ture anisotropies on the CMB, but again the resulting constraint is an integral 
along the line of sight (see §11.4). Quasar spectra obviously do provide redshift 
information, but they suffer from saturated absorption at z > 6 and can shed 
light only on the late stages of reionization (see §11.2). Finally, galaxy sur- 
veys will undoubtedly be tremendously useful for understanding high-redshift 
galaxies, but they are less than ideal for studying reionization. First, these 
galaxies are so distant and faint that collecting substantial numbers of objects 
at z > 8 must await new instruments such as the James Webb Space Tele- 
scope or a 30-meter near-infrared telescope. Second, the objects themselves 
will only offer indirect information on the IGM (see §11.1) and reionization, 
so such surveys will still leave many questions unanswered. 

Thus, new probes of the high-redshift Universe - including reionization, ear- 
lier phases of nonlinear structure formation, and the dark ages - are urgently 
needed. Our purpose is to review arguably the most promising of these tech- 
niques: the redshifted 21 cm line of HI. This transition, in which the electron 
flips its spin relative to the nucleus, has a long and successful history in as- 
tronomy, and in fact its cosmological implications were realized nearly half a 
century ago (see §1.2). Put simply, the goal is to use this line to map the neutral 
gas before and during reionization. The 21 cm transition has three enormous 
advantages. First, as a spectral line, redshift information can be used to trace 
the entire three-dimensional history. Second, it directly probes IGM gas, which 
contains the vast majority of baryonic matter. (Of course, this is actually a 
disadvantage if one is interested in studying the galaxies themselves.) Finally, 
as a forbidden transition with a mean lifetime of ~ 3 × 10^7 yr, the 21 cm line 
is far from saturation and is thus sensitive to the (most interesting) middle 
stages of reionization. 

Although these advantages have long been recognized (e.g., [50-52]), only 
recently has technology improved to the point where high-redshift 21 cm line 
observations 4 are feasible. As a result, recent years have seen a substantial ef- 
fort in theoretically modeling the expected signals and in planning or building 
instruments to observe them. These efforts include CoRE (Cosmic Reioniza- 
tion Experiment) at ATNF (Australia Telescope National Facility), designed 
to measure the mean background as a function of frequency, and a suite of 
antenna arrays designed to search for small-scale fluctuations in the 21 cm 
background. The latter set includes the 21 Centimeter Array (21CMA), LO- 
FAR (Low Frequency Array), the Precision Array to Probe Epoch of Reion- 
ization (PAPER), the Mileura Widefield Array Low Frequency Demonstrator 
(which we will refer to as the MWA), and the Square Kilometer Array (SKA). 
All of these experiments (except for the last one) should begin gathering data 
within the next several years. They hope to constrain the reionization epoch 
itself at z < 12; observing the higher-redshift Universe at lower frequencies, 
where foregrounds are much stronger, will require much larger instruments. 
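
As a rough guide to the frequencies involved, this short sketch converts between redshift and the observed frequency of the 21 cm line using nu_obs = nu_21/(1 + z). The rest frequency is the standard laboratory value of the hyperfine line; the function names are illustrative only.

```python
NU_21_MHZ = 1420.405751768  # rest frequency of the HI hyperfine line [MHz]


def observed_frequency_mhz(z):
    """Observed frequency [MHz] of 21 cm radiation emitted at redshift z."""
    return NU_21_MHZ / (1.0 + z)


def redshift_from_frequency(nu_mhz):
    """Emission redshift corresponding to an observed frequency [MHz]."""
    return NU_21_MHZ / nu_mhz - 1.0


if __name__ == "__main__":
    # Redshifts mentioned in this review: end of reionization, the first-generation
    # experiment limit, and the dark ages.
    for z in (6.0, 12.0, 50.0, 200.0):
        print(f"z = {z:5.1f}  ->  {observed_frequency_mhz(z):7.2f} MHz")
```

The reionization window at z < 12 therefore sits at roughly 100 to 200 MHz, while the dark ages push the signal to a few tens of MHz and below.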

This review will focus on theoretical predictions of the 21 cm signal over a 
range of cosmic time - from z > 50, when it allows us to probe the growth of 
the earliest structures, to z ~ 6, when reionization ends. As we will see, such 
predictions will be crucial to interpreting the data once it arrives, because the 
signals will (at least with the first generation of experiments) be buried within 
the huge foreground noise. The models described here can help to isolate the 
cosmological signal, and they also illustrate the potential rewards of these 
observations - at least those we know of today. 

This paper is organized as follows. The remainder of this section provides a 
brief historical overview of 21 cm line cosmology (§1.2) and presents some 
basic notation and formulae (§1.3). The next three sections review the funda- 
mentals necessary for understanding the 21 cm signal. We then describe the 
relevant physics of the 21 cm transition in §2. From there, we forge ahead 
applying this knowledge to the high-redshift Universe. First, in §3 we examine 
the mean evolution of the 21 cm background. In §4 we shift gears to describe 
the power of statistical measurements of fluctuations in the 21 cm background; 
we then explore applications of this technique to the dark ages (§5), the forma- 
tion of the first nonlinear structures (§6), the first luminous sources (§7), and 
finally to reionization itself (§8). These four sections are (for the most part) 
independent of each other and need not be read sequentially. We put these 
efforts in an observational context in §9 and discuss instrumental sensitivities 
and the expected systematic challenges. We then describe a complementary 
approach that uses bright background sources to illuminate the "21 cm forest" 
of absorption features in §10. Finally, we connect 21 cm experiments to other 
observational probes in §11, and we offer some concluding thoughts in §12. 

4 Note that here, and throughout this review, we will abuse the phrase "21 cm" 
by applying it to this high-redshift regime without qualification, even though the 
observed wavelengths are of course 21(1 + z) cm. 



1.2 Historical Overview 

The 21 cm transition is a rare example (for astronomy) of a successful theo- 
retical calculation driving observational effort: 5 Indirectly inspired by Oort's 
hope that then-unknown radio lines would help in studying our Galaxy, the 
physicist van de Hulst [53, 54] first computed the transition frequency of the 
hyperfine transition of HI during the Second World War and immediately 
realized its usefulness for tracing Galactic structure and kinematics. Several 
groups of physicists, astrophysicists, and engineers subsequently detected the 
line in 1951 [55-57], followed shortly by observations of spiral structure in the 
Milky Way [58] and the measurement of the rotation curve of M31 [59]. At the 
same time, theoretical understanding of the atomic physics of the hydrogen 
atom itself continued to improve [60]. 

The 1960s saw a number of observational programs aiming to detect hydro- 
gen in the nearby IGM, through both emission and absorption against bright 
extragalactic radio sources [50,61-65]. It is interesting to note, in fact, that 
searches for IGM absorption along lines of sight to radio galaxies (what we 
call the "21 cm forest" in §10, although then expected to be much more uniform) 
preceded the discovery of both quasars and the Lyα forest. These placed 
non-trivial limits on cosmology (ruling out a closed universe full of neutral 
hydrogen, for example) but of course were doomed to failure given the highly- 
ionized IGM we now know to exist. At the same time, there was a theoretical 
assessment of excitation conditions in a dilute, neutral IGM [66-68], a topic 
crucial for interpreting observations and that will receive detailed treatment 
in this review (see §2). Field reviewed the status of IGM studies (as of 1972) 
in [69]. 

The great leap to studying the 21 cm line at cosmological distances occurred 
soon after. Zel'dovich's proposal that large-scale structure develops from the 
IGM through collapse of cluster-sized masses [51,70-72], with subsequent frag- 
mentation into galaxies, inspired a number of efforts to observe the 21 cm 
line from z > 3. This "top-down scenario" implied the existence of high-redshift 
"Zel'dovich pancakes" containing ~10^14 M_⊙ of neutral gas that would 
have been detectable with existing radio telescopes at z > 3-4; the predicted 
brightness temperatures were δT_b ~ 0.1-1 K. Table 1 summarizes the observational 
programs driven (at least in part) by these predictions. None were 
successful, though many reached limiting HI masses comparable to predictions 
(and a tentative detection [73] did spur a brief flurry of excitement from 
theorists [74,75]). They also encountered many of the same difficulties that 
modern experiments will face, particularly terrestrial interference and 
foreground subtraction, and pioneered many of the strategies to overcome these 
challenges. 

5 We regard this episode as inspiration for our review and hope that the theory 
presented here proves even a fraction as useful! 

Table 1 

Surveys for high-redshift HI 21 cm line emission. We list the emission redshift, 
the observed frequency, the telescopes used to observe the field, the date of first 
publication, and references. Here the VLA is the Very Large Array, the WSRT is 
the Westerbork Synthesis Radio Telescope, and the GMRT is the Giant Metrewave 
Radio Telescope. 

There was also some observational motivation for expecting an increased HI 
content at z > 3, because the early results from surveys for DLA absorp- 
tion lines against high-z quasars [76] appeared to indicate a factor ~10 over- 
abundance of HI compared to the present epoch. This has proved to be a red 
herring, because better measurements of the DLA statistics [77] and refine- 
ments to the cosmological model now make it clear that the HI content at 
z ~ 3 is unlikely to be more than a factor of two above the z ~ 0 value [78]. 

Of course, we are now convinced that structure forms "bottom-up," with the 
smallest halos collapsing first and subsequently merging into larger objects. In 
these kinds of scenarios, massive, neutral pancakes would not form at high 
redshifts, and the lack of success in the surveys listed in Table 1 is not surprising. 
The first predictions in the context of this kind of model came in 1979 [52], 
but they received little attention over the next decade (largely because, in 
contrast to the pancake scenario, observations could not yet constrain them). 
Scott & Rees [94] returned theoretical attention to 21 cm emission from high 
redshifts, examining the signals expected at z = 8.4 in a variety of galaxy 
formation models. They pointed out that, although the structures in a cold dark 
matter (CDM) model would be extremely faint at high redshifts, statistically 
measuring their scale-dependent fluctuations could constrain the matter power 
spectrum (remarkably similar to the strategies now under development). 

During the remainder of the 1990s, the Cosmic Background Explorer's mea- 
surements of the CMB, together with a wide array of other observations, 
cemented our cosmological paradigm. Thus during this time, 21 cm predic- 
tions based on the CDM model began to appear [95-97], although most still 
focused on imaging rare, massive objects at high redshift. The prospects for 
existing telescopes looked bleak in this model. At the same time, Lyα forest 
measurements at z > 5 showed that the Universe remained highly ionized to 
quite large redshifts. Interest in these cosmological applications declined, be- 
cause radio astronomers viewed the z > 5 Universe with trepidation, given the 
brightness of the radio sky and the increasing levels of terrestrially-generated 
radio frequency interference. 

Interest in the 21 cm signal grew again when theorists began to study reioniza- 
tion and the nature of the first ionizing sources. Madau, Meiksin, & Rees [98] 
were the first to consider explicitly the effect of luminous sources on the 21 cm 
signal, emphasizing the role these sources played in setting the spin tempera- 
ture of the IGM. Since then, work has focused less on measuring cosmological 
parameters and more on understanding the interplay between luminous galax- 
ies and the IGM. These questions have become particularly interesting in light 
of the rich but confusing observational constraints on reionization described 
in §1.1. At the same time, technological improvements are bringing the z > 6 
Universe within our grasp. The remainder of this review will describe the cur- 
rent state of theoretical predictions for the 21 cm signal within this "modern" 
approach. 



1.3 Some Preliminaries 

We take this opportunity to summarize some useful parameters and relations. 
In the numerical calculations made specifically for this review, we assume a 
cosmology with Ω_m = 0.26, Ω_Λ = 0.74, Ω_b = 0.044, H_0 = 100h km s^-1 Mpc^-1 
(with h = 0.74), n = 0.94, and σ_8 = 0.8. Here Ω_i is the density in component i 
relative to the critical value, ρ_c(z) = 3H(z)^2/(8πG), H_0 is the Hubble constant 
at the present day, n is the spectral index of the matter power spectrum at 
inflation, and σ_8 normalizes the power spectrum; it is the amplitude of the 
rms fluctuations smoothed on 8 h^-1 Mpc scales. These are consistent with the 
most recent WMAP results [20]. Note, however, that we have increased σ_8 to 
improve agreement with other observations [99,100]. For figures taken from 
other sources, we refer the reader to the referenced works for their assumed 
parameters. Within the existing uncertainties, the choices for these parameters 
do not significantly affect our predictions. The exception is σ_8, which shifts 
the structure formation timetable but otherwise has no qualitative impact. 

In this cosmology, the hydrogen density at z = 0 is n_H = 2.06 × 10^-7 [(1 - 
Y_p)/0.76] cm^-3, where Y_p is the mass fraction of helium. The Hubble constant 
evolves as H(z) = H_0 [Ω_m (1 + z)^3 + Ω_Λ]^(1/2) or, at the high redshifts in which we 
are interested, H(z) ≈ H_0 Ω_m^(1/2) (1 + z)^(3/2). In this regime, the age of the universe 
is 

$$ t(z) \approx \frac{2}{3 H(z)}. \tag{1} $$
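
The quantities in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a helium mass fraction Y_p = 0.24 (implied by the 0.76 normalization but not stated explicitly), uses the standard matter-dominated result t = 2/(3H) for the age, and takes the critical density and proton mass as rounded textbook constants. All function and constant names are our own.

```python
import math

# Cosmological parameters quoted in this section.
OMEGA_M, OMEGA_L, OMEGA_B, H_LITTLE = 0.26, 0.74, 0.044, 0.74

# Assumed unit conversions and physical constants (rounded).
KM_PER_MPC = 3.0857e19
SEC_PER_YR = 3.156e7
RHO_CRIT_OVER_H2 = 1.878e-29        # critical density / h^2 [g cm^-3]
M_PROTON = 1.6726e-24               # proton mass [g]

H0 = 100.0 * H_LITTLE / KM_PER_MPC  # Hubble constant [1/s]


def hubble_exact(z):
    """H(z) [1/s] for a flat matter + Lambda universe."""
    return H0 * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def hubble_high_z(z):
    """High-redshift (matter-dominated) approximation to H(z) [1/s]."""
    return H0 * math.sqrt(OMEGA_M) * (1.0 + z) ** 1.5


def age_high_z_yr(z):
    """Age of the universe [yr] in the matter-dominated regime, t = 2/(3H)."""
    return 2.0 / (3.0 * hubble_high_z(z)) / SEC_PER_YR


def hydrogen_density_today(y_p=0.24):
    """Mean hydrogen number density at z = 0 [cm^-3]; y_p is an assumed value."""
    rho_baryon = OMEGA_B * RHO_CRIT_OVER_H2 * H_LITTLE ** 2
    return rho_baryon * (1.0 - y_p) / M_PROTON


if __name__ == "__main__":
    print(f"n_H(z=0)        = {hydrogen_density_today():.3e} cm^-3")
    print(f"H approx/exact  = {hubble_high_z(9.0) / hubble_exact(9.0):.4f} at z = 9")
    print(f"age at z = 9    = {age_high_z_yr(9.0):.3e} yr")
```

At the redshifts of interest the cosmological constant changes H(z) by well under a percent, which is why the simpler power-law form is adequate.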



Unless otherwise specified, we quote distances in comoving units, r_com = (1 + 
z) r_prop, defined by 

$$ \frac{dr_{\rm com}}{dz} = \frac{c}{H(z)}. \tag{2} $$

The angular diameter distance is D_A(z) = r_com/(1 + z), and the luminosity 
distance is D_L(z) = r_com (1 + z). For convenience, a useful conversion between 
observed angular scale Δθ and comoving size at z > 10 is 

$$ r_{\rm com} \approx 1.9 \left(\frac{\Delta\theta}{1'}\right) \left(\frac{1+z}{10}\right)^{0.2} h^{-1}\,{\rm Mpc}. \tag{3} $$

A similar approximation for the relation between observed bandwidth Δν and 
the radial distance is 

$$ r_{\rm com} \propto \Delta\nu\,(1+z)^{1/2}\,(\Omega_m h^2)^{-1/2}. \tag{4} $$
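
Conversions of this kind are straightforward to evaluate numerically. The sketch below integrates dr_com/dz = c/H(z) with Simpson's rule to get the comoving size subtended by a given angle, and uses the local relation between a frequency interval and a redshift interval, dz = (1 + z)^2 dν/ν_21, for the radial direction. It assumes the flat cosmology of this section; the function names and the choice of integrator are ours and are meant only as an illustration.

```python
import math

OMEGA_M, OMEGA_L, H_LITTLE = 0.26, 0.74, 0.74   # cosmology of this section
C_KM_S = 299792.458                              # speed of light [km/s]
NU_21_MHZ = 1420.405751768                       # 21 cm rest frequency [MHz]
ARCMIN_RAD = math.pi / (180.0 * 60.0)            # one arcminute [rad]


def hubble_km_s_mpc(z):
    """H(z) [km/s/Mpc] for a flat matter + Lambda universe."""
    return 100.0 * H_LITTLE * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def comoving_distance_mpc(z, steps=4000):
    """Comoving distance [Mpc] from Simpson's rule on c/H(z); steps must be even."""
    step = z / steps
    total = C_KM_S / hubble_km_s_mpc(0.0) + C_KM_S / hubble_km_s_mpc(z)
    for i in range(1, steps):
        coeff = 4.0 if i % 2 else 2.0
        total += coeff * C_KM_S / hubble_km_s_mpc(i * step)
    return total * step / 3.0


def transverse_size_h_inv_mpc(z, theta_arcmin=1.0):
    """Comoving size [h^-1 Mpc] subtended by theta_arcmin at redshift z."""
    return comoving_distance_mpc(z) * theta_arcmin * ARCMIN_RAD * H_LITTLE


def radial_size_mpc(z, delta_nu_mhz):
    """Comoving depth [Mpc] spanned by an observed bandwidth delta_nu_mhz at z."""
    delta_z = (1.0 + z) ** 2 * delta_nu_mhz / NU_21_MHZ
    return C_KM_S / hubble_km_s_mpc(z) * delta_z


if __name__ == "__main__":
    print(f"1 arcmin at z = 9 : {transverse_size_h_inv_mpc(9.0):.2f} h^-1 Mpc")
    print(f"0.1 MHz at z = 9  : {radial_size_mpc(9.0, 0.1):.2f} Mpc")
```

The radial function reproduces the (1 + z)^(1/2) (Ω_m h^2)^(-1/2) scaling at high redshift, since H(z) ∝ (Ω_m h^2)^(1/2) (1 + z)^(3/2) there.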



We also note that, according to common usage, we use h to denote both the 
normalized Hubble constant and Planck's constant. The desired meaning will 
normally be clear from context. 

Throughout much of our theoretical discussion, we will require the dark matter 
halo mass function. For simplicity, we will generally use the Press-Schechter 
form [101] (see [102] for a pedagogical review of the halo mass function and its 
derivation). The comoving number density of halos with masses in the range 
(m, m + dm) is 

$$ n(m)\,dm = \sqrt{\frac{2}{\pi}}\,\frac{\bar{\rho}}{m^2}\,\left|\frac{d\ln\sigma}{d\ln m}\right|\,\frac{\delta_c(z)}{\sigma(m)}\,\exp\left[-\frac{\delta_c^2(z)}{2\sigma^2(m)}\right] dm, \tag{5} $$

where ρ̄ is the mean comoving mass density and σ^2(m) is the variance of the 
density field at z = 0 when smoothed in top-hat spheres of mass m: 

$$ \sigma^2(m) = \int \frac{dk}{2\pi^2}\,k^2 P_{\rm lin}(k) \left[\frac{3 j_1(kR)}{kR}\right]^2. \tag{6} $$

Here P_lin(k) is the linear matter power spectrum (we use the transfer function 
of [103]), j_1(x) = (sin x - x cos x)/x^2, and m = 4πρ̄R^3/3. Finally, δ_c(z) ≈ 
1.69/D(z) is the linearized density threshold for collapse in the spherical 
top-hat model and D(z) is the linear growth factor, normalized so that D(z = 
0) = 1. Note that we use the convention in which σ^2 is always evaluated at a 
fixed time, so that δ_c increases toward higher redshifts. Of course this is only 
for notational convenience; in reality, the threshold for virialization is a fixed 
physical density. 

In detail, equation (5) is not an ideal fit to the halo mass function found in
numerical simulations at low redshifts, and more accurate forms exist in the
literature [104,105]. The high redshift case is less clear; the mass function
seems to lie in between the Press-Schechter and Sheth-Tormen forms [106-109].
Nevertheless, equation (5) suffices for our purposes because the existing
analytic models have many other uncertainties and approximations. It has the
enormous advantage of being easy to manipulate analytically. For example, the
collapse fraction, or fraction of the mass in halos more massive than
$m_{\rm min}$, is simply

$$ f_{\rm coll} = \frac{1}{\bar\rho} \int_{m_{\rm min}}^{\infty} dm\, m\, n(m) = {\rm erfc}\left[\frac{\delta_c(z)}{\sqrt{2}\,\sigma(m_{\rm min})}\right]. \tag{7} $$

We will use $f_{\rm coll}$ in numerous calculations throughout this review.
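
The statement that the Press-Schechter form integrates to a complementary error
function can be verified numerically. The following is a minimal sketch that
assumes a toy power-law $\sigma(m)$ (the function `sigma_toy` and its parameters
are hypothetical stand-ins, not the power spectrum used in this review) and
compares a direct integration of the mass fraction $m\,n(m)\,dm/\bar\rho$ with the
closed-form erfc expression.

```python
import math

DELTA_C0 = 1.69  # linearized collapse threshold at z = 0


def sigma_toy(m, sigma_star=1.0, m_star=1.0e12, slope=0.25):
    """Hypothetical rms fluctuation: sigma = sigma_star (m/m_star)^(-slope)."""
    return sigma_star * (m / m_star) ** (-slope)


def mass_fraction_per_lnm(m, growth=1.0, slope=0.25):
    """m^2 n(m) / rho_bar from the Press-Schechter form (fraction per ln m)."""
    nu = DELTA_C0 / (growth * sigma_toy(m, slope=slope))
    # |dln sigma / dln m| equals `slope` for the power-law toy model.
    return math.sqrt(2.0 / math.pi) * slope * nu * math.exp(-0.5 * nu * nu)


def collapse_fraction_numeric(m_min, growth=1.0, n_steps=20000, ln_span=40.0):
    """Trapezoid integration of the mass fraction over ln m above m_min."""
    ln_lo = math.log(m_min)
    dlnm = ln_span / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * mass_fraction_per_lnm(math.exp(ln_lo + i * dlnm), growth)
    return total * dlnm


def collapse_fraction_analytic(m_min, growth=1.0):
    """Closed form: erfc[delta_c(z) / (sqrt(2) sigma(m_min))]."""
    nu_min = DELTA_C0 / (growth * sigma_toy(m_min))
    return math.erfc(nu_min / math.sqrt(2.0))
```

Here `growth` plays the role of $D(z)$, so lowering it mimics moving to higher
redshift, where the collapse fraction drops steeply.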
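Our Fourier transform conventions follow

$$ f(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, f(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{x}} \tag{8} $$

and

$$ f(\mathbf{k}) = \int d^3x\, f(\mathbf{x})\, e^{-i\mathbf{k}\cdot\mathbf{x}}. \tag{9} $$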

Finally, for convenience we present a summary table of many of the more
common symbols in this review, together with brief definitions and references
to where each is introduced.

Table 2: Symbol dictionary. Middle column shows the equation or section in
which the symbol is first used or defined.

Symbol                              Ref.     Definition
$A_{10}$                            (14)     Spontaneous decay rate of 21 cm transition
$A_e$                               (148)    Effective area of one antenna element
$A_{\rm tot}$                       §9       Total effective collecting area
$B$                                 (150)    Total bandwidth of observation
                                    (123)    Bias of HII region
$C$                                 (84)     Clumping factor
$C_{01}$, $C_{10}$                  (19)     Collisional spin excitation, de-excitation rates
$C_l$                               (102)    Angular power spectrum of $\delta_{21}$
$C^N$                               (152)    Covariance matrix of noise
$C^{SV}$                            (153)    Covariance matrix of sample variance
$D(z)$                              (6)      Linear growth factor
$D_A$                               (2)      Angular diameter distance
$D_L$                               (2)      Luminosity distance
$D_{\rm (min,max)}$                 §9       Minimum/maximum baseline in interferometer
$f_{\rm coll}$                      (7)      Collapse fraction
$f_{\rm esc}$                       (83)     Escape fraction for ionizing photons
$f_{\rm He}$                        (63)     Helium fraction (by number)
$f_{\rm rec}(n)$                    (57)     Recycling fraction for Ly$n$ $\rightarrow$ Ly$\alpha$ photons
$f_{\rm sh}$                        (78)     Fraction of gas shock-heated above $T_\gamma$
$f_X$                               (68)     X-ray luminosity per SFR relative to local value
$f_{X,h}$, $f_{X,{\rm ion}}$        (69)     Fraction of X-ray energy in heating and ionization
$f_\star$                           (70)     Star formation efficiency
$g$                                 (112)    Coefficient in approximation $\delta_T = g(z)\delta$
$g_i$                               (11)     Statistical weight of level $i$
$I_\nu$                             §2.1     Specific intensity
$J_\alpha$                          (40)     $J_\nu$ near Ly$\alpha$ ignoring radiative transfer effects
$J_\nu$                             §2.1     Angle-averaged specific intensity
$J_\alpha^C$                        (40)     Fiducial $J_\nu$ for $x_\alpha = S_\alpha$
                                    (153)    Distance to 21 cm screen
$n(m)$                              (5)      Halo mass function
$n_b(m)$                            (122)    Mass function of HII regions
$\bar{n}_b^0$                       (79)     Comoving baryon number density
$n(u_\perp)$                        (151)    Baseline distribution [also $n(k, \mu)$]
$n_\nu$                             (41)     Photon occupation number
$N_a$                               §9       Number of elements in interferometer
$N_B$                               §9.1.5   Number of baselines in interferometer
$N_c$                               (157)    Number of measurements in $k$ bin
$N_\alpha$                          (79)     Number of (10.2, 13.6) eV photons per stellar baryon
$N_{\rm ion}$                       (83)     Number of ionizing photons per stellar baryon
$P_{21}$                            (91)     Power spectrum of $\delta_{21}$ (units: volume)
$P_{01}$, $P_{10}$                  (19)     Spin (de-)excitation rate from Ly$\alpha$ absorptions
$P_V(\delta_{\rm nl})$              (127)    Volume-weighted distribution of $\delta_{\rm nl}$
$P_{xx}$                            (132)    Power spectrum of ionized fraction
$P_{x\delta}$                       (132)    Cross-power spectrum of $x_i$, density
$P_\alpha$                          (37)     Total Ly$\alpha$ scattering rate
$P_{\delta\delta}$                  §4       Matter power spectrum
$P_{\Delta T}$                      (157)    Power spectrum of $\delta_{21}$ (units: mK$^2$ volume)
$S_\alpha$                          (40)     Correction to $x_\alpha$ from radiative transfer
$S_\nu$                             §2.1     Flux density
$t_{\rm int}$                       (135)    Integration time
$t_{(\mathbf{k},\mathbf{u})}$       (151)    Integration time for the single mode $\mathbf{k}$ (or $\mathbf{u}$)
$t_\gamma$                          (63)     Compton cooling time
$T_b$                               §2.1     Brightness temperature
$T_c$                               (42)     Color temperature of Ly$\alpha$ background
$T_K$                               (20)     Gas kinetic temperature
$T_S$                               (11)     21 cm spin temperature
$T_{\rm sky}$                       (136)    Sky temperature
$T_{\rm sys}$                       (135)    System temperature
$T_\gamma(z)$                       §2.1     CMB temperature at redshift $z$
$T_\star$                           (11)     Energy defect of 21 cm transition (0.068 K)
$(u, v)$                            (144)    Sky coordinates (in units of $\lambda$)
$V$                                 (143)    Visibility
$v_\parallel$                       (15)     Proper velocity along line of sight
                                    §9.1.3   Primary beam of antenna
$x_c$                               (24)     Collisional coupling coefficient for $T_S$
$x_{\rm tot}$                       (93)     Total coupling coefficient for $T_S$
$x_\alpha$                          (40)     Wouthuysen-Field coupling coefficient for $T_S$
$x_{\rm HI}$                        §1.1     Neutral fraction
$\bar{x}_{\rm HI}$                  §1.1     Globally-averaged neutral fraction
$x_i$                               §1.1     Ionized fraction
$\bar{x}_i$                         §1.1     Globally-averaged ionized fraction
$z_c$                               §3.5.1   Redshift at which $x_\alpha = 1$
$z_h$                               §3.5.1   Redshift at which $T_K = T_\gamma$ in the IGM
$z_r$                               §3.5.1   Reionization redshift
$\alpha_{(A,B)}$                    §3.4.1   Recombination coefficient (case-A,B)
$\beta$                             (93)     Expansion coefficient for $\delta$ in $\delta_{21}$
$\beta_T$                           (96)     Expansion coefficient for $\delta_T$ in $\delta_{21}$
$\beta_x$                           (94)     Expansion coefficient for $\delta_x$ in $\delta_{21}$
$\beta_\alpha$                      (95)     Expansion coefficient for $\delta_\alpha$ in $\delta_{21}$
$\gamma$                            (47)     Sobolev parameter
                                    (50)     Sobolev parameter including spin exchange
$\delta$                            (15)     Fractional matter overdensity
$\delta_{21}$                       (91)     Fractional perturbation to $\delta T_b$
$\delta_c(z)$                       (5)      Linear critical density for halo collapse
$\delta_{\rm coll}$                 (67)     Critical overdensity for collisional coupling
$\delta_{\rm iso}$                  (106)    Isotropic component of $\delta_{21}$
$\delta_{\rm nl}$                   (127)    Nonlinear density field
$\delta_T$                          (92)     Fractional perturbation to $T_K$
$\delta_x$                          (92)     Fractional perturbation to neutral fraction
$\delta_\alpha$                     (92)     Fractional perturbation to $x_\alpha$
$\delta A$                          (148)    Physical area of one antenna element
$\delta P_{\Delta T}$               (157)    Error on power spectrum
$\delta T_b$                        (17)     21 cm brightness temperature relative to CMB
$\overline{\delta T}_b$             (17)     Globally-averaged $\delta T_b$
$\delta\eta$                        (150)    Inverse bandwidth of measurement
$\Delta T^N$                        (135)    Telescope noise
$\Delta^2_{21}$                     (91)     Variance of 21 cm brightness temperature
                                    (153)    Radial width of 21 cm screen
$\Delta\nu$                         (135)    Channel width of observation
$\Delta\nu_D$                       (38)     Doppler width of Ly$\alpha$ transition
$\epsilon(\nu)$                     (79)     Comoving emissivity at frequency $\nu$
$\epsilon_{\rm ap}$                 (140)    Aperture efficiency
$\epsilon_{\rm comp}$               (63)     Heating rate from Compton scattering
$\epsilon_X$                        (70)     Heating rate from X-rays
$\epsilon_\alpha$                   (71)     Heating rate from Ly$\alpha$ scattering
$\zeta$                             (83)     Ionizing efficiency parameter
$\eta$                              (48)     Recoil parameter
                                    (51)     Recoil parameter including spin exchange
$\eta_f$                            (139)    Array filling factor
$\theta_D$                          §9.1     Diffraction-limited angular resolution
$\kappa_{10}$                       (24)     Collisional spin de-excitation rate coefficient
$\mu$                               (106)    Cosine of angle between $\mathbf{k}$ and line of sight
$\nu_0$                             §2.1     Rest frequency of 21 cm line
$\nu_\alpha$                        (38)     Rest frequency of Ly$\alpha$ transition
$\sigma^2$                          (6)      Variance of density field at $z = 0$
$\tau_{\rm es}$                     §1.1     CMB electron scattering optical depth
$\tau_{\rm GP}$                     (43)     Gunn-Peterson optical depth in Ly$\alpha$
$\tau_\nu$                          (10)     Optical depth at frequency $\nu$
$\phi_\alpha$                       (37)     Profile of Ly$\alpha$ line
$\phi(\nu)$                         (12)     Profile of 21 cm line
$\chi_\alpha$                       (37)     Line center Ly$\alpha$ absorption cross section



2 Fundamental Physics of the 21 cm Line 

2.1 Basic Definitions 

We begin with some basic definitions necessary for what follows. The
fundamental quantity of radiative transfer is the brightness (or specific
intensity) $I_\nu$ of a ray emerging from a cloud at frequency $\nu$. This
conventionally expresses the energy carried by rays traveling along a given
direction, per unit area, frequency, solid angle, and time; it thus normally
has units of erg s$^{-1}$ cm$^{-2}$ sr$^{-1}$ Hz$^{-1}$ (see [110] for a pedagogical
introduction to radiative transfer). However, for many applications of
radiative transfer in an expanding universe, the units cm$^{-2}$ s$^{-1}$ Hz$^{-1}$
sr$^{-1}$ are more convenient, because photon number is conserved during the
expansion but energy is not; we will often work in the latter units in this
review. We will also usually slightly simplify the problem by using the
angle-averaged specific intensity $J_\nu = (4\pi)^{-1} \int I_\nu\, d\Omega$.

For convenience, we will quantify $I_\nu$ by the equivalent brightness temperature,
$T_b(\nu)$, required of a blackbody radiator (with spectrum $B_\nu$) such that
$I_\nu = B_\nu(T_b)$. Throughout the range of frequencies and temperatures relevant
to the 21 cm line, the Rayleigh-Jeans formula is an excellent approximation to
the Planck curve, so that $T_b(\nu) \approx I_\nu c^2 / 2 k_B \nu^2$, where $c$ is the speed of
light and $k_B$ is Boltzmann's constant.

We will be almost exclusively interested in the brightness temperature of the
HI 21 cm line, which has rest frequency $\nu_0 = 1420.4057$ MHz. Because of the
cosmological redshift, the emergent brightness $T_b'(\nu_0)$ measured in a cloud's
comoving frame at redshift $z$ creates an apparent brightness at the Earth
of $T_b(\nu) = T_b'(\nu_0)/(1+z)$, where the observed frequency is $\nu = \nu_0/(1+z)$.
Similarly, the brightness temperature of the CMB in a comoving frame at
redshift $z$ scales from the presently observed value of $T_\gamma(0) = 2.73$ K to
$T_\gamma'(z) = 2.73\,(1+z)$ K.$^6$

An individual gas cloud produces a flux (erg cm$^{-2}$ s$^{-1}$) at the Earth of
$S = \int_{\rm cloud} I_\nu \cos\theta\, d\Omega\, d\nu$, where the integral extends over the solid
angle subtended by the cloud and the frequency spread of the signal. In most of
our applications we will instead consider the frequency-dependent incident
energy flux. Radio astronomers typically measure this frequency-dependent flux
density, $S_\nu$, in Janskys (1 Jy $= 10^{-23}$ erg s$^{-1}$ cm$^{-2}$ Hz$^{-1}$). The apparent
angle $\theta$ between the cloud centroid and the element of solid angle $d\Omega$ is
generally small ($\sin\theta \approx \theta$) in the applications considered here. Thus for
uniform brightness clouds with small apparent angular diameter, a convenient
conversion is $S_\nu = 2 k_B T_b \nu^2 \Delta\Omega / c^2$, where all quantities are measured in
the observer's frame. Note that, once the cosmological redshift and the
variation of $d\Omega$ with distance are included, the flux scales like
$S \propto D_L^{-2}$, while the flux density scales as $S_\nu \propto (1+z) D_L^{-2}$.
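
As a rough illustration of the Rayleigh-Jeans conversions described above, the
following sketch converts between specific intensity, brightness temperature,
and flux density in cgs units. The function names are ours and purely
illustrative.

```python
import math

K_B = 1.380649e-16        # Boltzmann constant [erg/K]
C_LIGHT = 2.99792458e10   # speed of light [cm/s]
JANSKY = 1.0e-23          # 1 Jy in erg s^-1 cm^-2 Hz^-1


def brightness_temperature(intensity, nu_hz):
    """Rayleigh-Jeans T_b [K] for I_nu in erg s^-1 cm^-2 Hz^-1 sr^-1."""
    return intensity * C_LIGHT ** 2 / (2.0 * K_B * nu_hz ** 2)


def flux_density_jy(t_b, nu_hz, solid_angle_sr):
    """S_nu = 2 k_B T_b nu^2 dOmega / c^2 for a small uniform source [Jy]."""
    return (2.0 * K_B * t_b * nu_hz ** 2 * solid_angle_sr
            / C_LIGHT ** 2 / JANSKY)
```

For example, a patch with $T_b = 1$ K filling one square arcminute at an observed
frequency near 142 MHz corresponds to a flux density of order $5 \times 10^{-5}$ Jy,
which shows why diffuse 21 cm signals at the mK level are so faint.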

In the Rayleigh-Jeans limit, the equation of radiative transfer along a line of
sight through a cloud of uniform excitation temperature $T_{\rm ex}$ states that the
emergent brightness at frequency $\nu$ is

$$ T_b'(\nu) = T_{\rm ex} \left(1 - e^{-\tau_\nu}\right) + T_R'(\nu)\, e^{-\tau_\nu}, \tag{10} $$

where the optical depth $\tau_\nu = \int ds\, \alpha_\nu$ is the integral of the absorption
coefficient ($\alpha_\nu$) along the ray through the cloud, $T_R'$ is the brightness of the
background radiation field incident on the cloud along the ray, and $s$ is the
proper distance.

6 Henceforth we will drop the prime in $T_\gamma'(z)$ for notational convenience.

For the 21 cm transition, the excitation temperature $T_{\rm ex}$ is referred to as
the spin temperature $T_S$. It quantifies the relative number densities, $n_i$, of
atoms in the two hyperfine levels of the electronic ground state (we will use
the subscripts 1 and 0 to denote the triplet and singlet states, respectively;
these equal the total angular momentum $F$ of the atom). It is defined via

$$ \frac{n_1}{n_0} = \frac{g_1}{g_0}\, e^{-E_{10}/k_B T_S} = 3\, e^{-T_\star/T_S}, \tag{11} $$

where $g_i$ is the statistical weight (here $g_0 = 1$ and $g_1 = 3$),
$E_{10} = 5.9 \times 10^{-6}$ eV is the energy splitting, and $T_\star = E_{10}/k_B = 0.068$ K is
the equivalent temperature. Because all astrophysical applications have
$T_S \gg T_\star$, approximately three of four atoms find themselves in the excited
state. As a result, the absorption coefficient must include a correction for
stimulated emission (and hence it depends on $T_S$ as well). Note that, in detail,
the assumption of a single $T_S$ applying to the entire hydrogen distribution is
not necessarily correct. Rigorously, one should solve a Boltzmann equation that
couples the spin and velocity distributions [111]. When the collision time is
long, this introduces percent level changes to the brightness temperature.
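
The two consequences of $T_S \gg T_\star$ mentioned above (a triplet fraction close to
3/4, and a stimulated emission correction close to $T_\star/T_S$) are easy to confirm
numerically. This is a minimal sketch; the function names are illustrative.

```python
import math

T_STAR = 0.068  # equivalent temperature of the 21 cm transition [K]


def excited_fraction(t_spin):
    """Fraction of atoms in the triplet state for spin temperature t_spin [K]."""
    ratio = 3.0 * math.exp(-T_STAR / t_spin)  # n_1 / n_0
    return ratio / (1.0 + ratio)


def stimulated_emission_factor(t_spin):
    """Net absorption factor 1 - exp(-T_star / T_S), close to T_star / T_S."""
    return 1.0 - math.exp(-T_STAR / t_spin)
```

Only when $T_S$ approaches $T_\star$ itself, far below any temperature of
astrophysical interest, does the excited fraction depart noticeably from 3/4.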

The optical depth of a cloud of hydrogen is then:

$$ \tau_\nu = \int ds\, \sigma_{01} \left(1 - e^{-E_{10}/k_B T_S}\right) \phi(\nu)\, n_0 \tag{12} $$
$$ \approx \sigma_{01} \left(\frac{h\nu}{k_B T_S}\right) \left(\frac{N_{\rm HI}}{4}\right) \phi(\nu), \tag{13} $$

where

$$ \sigma_{01} \equiv \frac{3 c^2 A_{10}}{8\pi\nu^2}, \tag{14} $$

$A_{10} = 2.85 \times 10^{-15}$ s$^{-1}$ is the spontaneous emission coefficient of the 21 cm
transition, $N_{\rm HI}$ is the column density of HI (here the factor 1/4 accounts for
the fraction of HI atoms in the hyperfine singlet state), and $\phi(\nu)$ is the line
profile (defined so that $\int d\nu\, \phi(\nu) = 1$). The second factor in equation (12)
accounts for stimulated emission. The approximate form in equation (13)
assumes uniformity throughout the cloud.

In general, the line shape $\phi(\nu)$ includes natural, thermal, and pressure
broadening, as well as bulk motion (which increases the effective Doppler
spread). Our most important application is to IGM gas expanding uniformly with
the Hubble flow. Then the velocity broadening of a region of linear dimension
$s$ will be $\Delta V \sim s H(z)$ so that $\phi(\nu) \sim c/[s H(z) \nu]$. The column density along
such a segment depends on the neutral fraction $x_{\rm HI}$ of hydrogen, so
$N_{\rm HI} = x_{\rm HI}\, n_{\rm H}(z)\, s$. A more exact calculation yields, with equation (12),
an expression for the 21 cm optical depth of the diffuse IGM,

$$ \tau_{\nu_0} = \frac{3}{32\pi} \frac{h c^3 A_{10}}{k_B T_S \nu_0^2} \frac{x_{\rm HI}\, n_{\rm H}}{(1+z)\,(dv_\parallel/dr_\parallel)} \tag{15} $$
$$ \approx 0.0092\, (1+\delta)\, (1+z)^{3/2}\, \frac{x_{\rm HI}}{T_S} \left[\frac{H(z)/(1+z)}{dv_\parallel/dr_\parallel}\right], \tag{16} $$

where in the second equality $T_S$ is in degrees Kelvin. Here the factor $(1+\delta)$
is the fractional overdensity of baryons and $dv_\parallel/dr_\parallel$ is the gradient of the
proper velocity along the line of sight, including both the Hubble expansion
and the peculiar velocity. In the second line, we have substituted the velocity
$H(z)/(1+z)$ appropriate for the uniform Hubble expansion at high redshifts.
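
The numerical prefactor 0.0092 can be recovered from the physical constants by
evaluating $3 h c^3 A_{10} n_{\rm H} / [32\pi k_B T_S \nu_0^2 H(z)]$ for the mean density and
pure Hubble flow. The sketch below does so in cgs units. The hydrogen density
normalization is the one quoted earlier in the text, while `H_LITTLE` and
`OMEGA_M` are assumed, illustrative cosmological parameters.

```python
import math

H_PLANCK = 6.62607015e-27   # Planck constant [erg s]
K_B = 1.380649e-16          # Boltzmann constant [erg/K]
C_LIGHT = 2.99792458e10     # speed of light [cm/s]
A_10 = 2.85e-15             # spontaneous emission coefficient [s^-1]
NU_0 = 1.4204057e9          # 21 cm rest frequency [Hz]
MPC_CM = 3.0856776e24       # one megaparsec [cm]
N_H0 = 2.06e-7              # hydrogen density at z = 0 [cm^-3], from the text

# Assumed cosmological parameters (illustrative only).
H_LITTLE = 0.74
OMEGA_M = 0.26


def tau_prefactor():
    """Coefficient of (1+delta)(1+z)^{3/2} x_HI / T_S in the optical depth."""
    h0 = 100.0 * H_LITTLE * 1.0e5 / MPC_CM  # Hubble constant [s^-1]
    numerator = 3.0 * H_PLANCK * C_LIGHT ** 3 * A_10 * N_H0
    denominator = 32.0 * math.pi * K_B * NU_0 ** 2 * h0 * math.sqrt(OMEGA_M)
    return numerator / denominator


def tau_21cm(z, t_spin, x_hi=1.0, delta=0.0):
    """21 cm optical depth of the diffuse IGM for pure Hubble flow."""
    return tau_prefactor() * (1.0 + delta) * (1.0 + z) ** 1.5 * x_hi / t_spin
```

With these inputs the coefficient comes out very close to 0.0092, and the
optical depth of the mean IGM stays well below unity for any plausible spin
temperature, which justifies the optically thin expansions used in this section.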

The two applications of equation (10) that will be most important here are:

1. The contrast between high-redshift hydrogen clouds and the CMB. Many
of the observational strategies for the 21 cm line involve comparison of lines
of sight through a cloud$^7$ to (sometimes hypothetical) sightlines with clear
views of the CMB. Thus we hope to measure

$$ \delta T_b(\nu) = \frac{T_S - T_\gamma(z)}{1+z} \left(1 - e^{-\tau_{\nu_0}}\right) \approx \frac{T_S - T_\gamma(z)}{1+z}\, \tau_{\nu_0} \tag{17} $$
$$ \approx 9\, x_{\rm HI}\, (1+\delta)\, (1+z)^{1/2} \left[1 - \frac{T_\gamma(z)}{T_S}\right] \left[\frac{H(z)/(1+z)}{dv_\parallel/dr_\parallel}\right] {\rm mK}. \tag{18} $$

Note that $\delta T_b$ saturates if $T_S \gg T_\gamma$, but it can become arbitrarily large (and
negative) if $T_S \ll T_\gamma$. The observability of the 21 cm transition therefore
hinges on the spin temperature; we will describe below the mechanisms that
drive $T_S$ either above or below $T_\gamma(z)$, which dictate whether the 21 cm signal
will appear in emission, absorption, or not at all.

2. Absorption against high redshift radio sources (§10). The brightness
temperatures of nonthermal radio continuum sources ($T_{\rm src} \sim 10^6$-$10^{10}$ K) far
exceed $T_S$ and $T_\gamma$, so the flux density received from the direction of a high
redshift radio source is $S_\nu \approx S_{\rm src} \exp(-\tau_\nu)$. High-redshift radio-loud
quasars or radio galaxies would make superb probes of cloud structure in the
neutral or partially reionized IGM through their absorption line spectra.

7 Here we use "cloud" to refer to any patch of the IGM; it need not be physically
distinct from the surrounding gas.
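
The asymmetry between emission and absorption in the first application is
easiest to see numerically. The sketch below evaluates the optically thin
brightness temperature contrast, $\delta T_b \approx 9\, x_{\rm HI} (1+\delta)(1+z)^{1/2}
[1 - T_\gamma(z)/T_S]$ mK, for pure Hubble flow by default. The function and its
argument names are illustrative only.

```python
def delta_tb_mk(z, t_spin, x_hi=1.0, delta=0.0, t_gamma0=2.73,
                velocity_factor=1.0):
    """Optically thin 21 cm contrast against the CMB, in mK.

    velocity_factor stands for [H(z)/(1+z)] / (dv_par/dr_par) and equals 1
    for gas moving with the uniform Hubble flow.
    """
    t_gamma = t_gamma0 * (1.0 + z)  # CMB temperature at redshift z [K]
    return (9.0 * x_hi * (1.0 + delta) * (1.0 + z) ** 0.5
            * (1.0 - t_gamma / t_spin) * velocity_factor)
```

At $z = 9$ the emission signal saturates near 28 mK however large $T_S$ becomes,
whereas a cold IGM with $T_S$ a factor of ten below $T_\gamma$ would produce an
absorption signal roughly nine times deeper.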

Three competing processes determine $T_S$: (1) absorption of CMB photons (as
well as stimulated emission); (2) collisions with other hydrogen atoms, free
electrons, and protons; and (3) scattering of UV photons. We let $C_{10}$ and
$P_{10}$ be the de-excitation rates (per atom) from collisions and UV scattering,
respectively; they will be examined in detail in the following sections. We also
let $C_{01}$ and $P_{01}$ be the corresponding excitation rates. The spin temperature
is then determined in equilibrium by$^8$

$$ n_1 \left(C_{10} + P_{10} + A_{10} + B_{10} I_{\rm CMB}\right) = n_0 \left(C_{01} + P_{01} + B_{01} I_{\rm CMB}\right), \tag{19} $$

where $B_{01}$ and $B_{10}$ are the appropriate Einstein coefficients and $I_{\rm CMB}$ is the
energy flux of CMB photons. With the Rayleigh-Jeans approximation, equation
(19) can be rewritten as [67]

$$ T_S^{-1} = \frac{T_\gamma^{-1} + x_c T_K^{-1} + x_\alpha T_c^{-1}}{1 + x_c + x_\alpha}, \tag{20} $$

where $x_c$ and $x_\alpha$ are coupling coefficients for collisions and UV scattering,
respectively, and $T_K$ is the gas kinetic temperature. Here we have used detailed
balance through the relation

$$ \frac{C_{01}}{C_{10}} = \frac{g_1}{g_0}\, e^{-T_\star/T_K} \approx 3 \left(1 - \frac{T_\star}{T_K}\right). \tag{21} $$

We have then defined the effective color temperature of the UV radiation field
$T_c$ via

$$ \frac{P_{01}}{P_{10}} \equiv 3 \left(1 - \frac{T_\star}{T_c}\right). \tag{22} $$

The goal of the next two sections will be to calculate $x_c$, $x_\alpha$, and $T_c$. In the
limit in which $T_c \to T_K$ (a reasonable approximation in most situations of
interest, as we will see in §2.3), equation (20) may be written

$$ 1 - \frac{T_\gamma}{T_S} = \frac{x_c + x_\alpha}{1 + x_c + x_\alpha} \left(1 - \frac{T_\gamma}{T_K}\right). \tag{23} $$

8 Note that the relevant timescales are all much shorter than the expansion time,
so equilibrium is an excellent approximation.
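
The weighted-mean structure of the spin temperature is easy to explore
numerically. The sketch below implements
$T_S^{-1} = (T_\gamma^{-1} + x_c T_K^{-1} + x_\alpha T_c^{-1})/(1 + x_c + x_\alpha)$ and, for
$T_c = T_K$, the equivalent expression for the contrast factor $1 - T_\gamma/T_S$. The
function names are illustrative.

```python
def spin_temperature(t_gamma, t_k, x_c, x_alpha, t_c=None):
    """Equilibrium spin temperature [K]; t_c defaults to the kinetic value."""
    if t_c is None:
        t_c = t_k
    inverse = ((1.0 / t_gamma + x_c / t_k + x_alpha / t_c)
               / (1.0 + x_c + x_alpha))
    return 1.0 / inverse


def contrast_factor(t_gamma, t_k, x_c, x_alpha):
    """1 - T_gamma/T_S in the limit T_c -> T_K."""
    x_tot = x_c + x_alpha
    return x_tot / (1.0 + x_tot) * (1.0 - t_gamma / t_k)
```

With no coupling ($x_c = x_\alpha = 0$) the spin temperature relaxes to $T_\gamma$ and the
signal vanishes; with strong coupling it approaches $T_K$, and the sign of the
signal is set by whether the gas is hotter or colder than the CMB.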



2.2 Collisional Coupling



We will first consider collisional excitation and de-excitation of the hyperfine
levels, which dominate in dense gas. The coupling coefficient for species $i$ is

$$ x_c^i \equiv \frac{C_{10}^i}{A_{10}} \frac{T_\star}{T_\gamma} = \frac{n_i\, \kappa_{10}^i}{A_{10}} \frac{T_\star}{T_\gamma}, \tag{24} $$

where $\kappa_{10}^i$ is the rate coefficient for spin de-excitation in collisions with that
species (with units of cm$^3$ s$^{-1}$). The total $x_c$ is the sum over all $i$. We next
show how these rates are calculated. Readers who are not interested in the
details can simply use the final results presented in Figure 2 and Tables 3-4.



2.2.1 H-H Collisions 

The role of hydrogen-hydrogen collisions in setting the spin temperature has
received extensive attention in the literature, dating back to the first
discussions of the spin temperature [67,112-118]. The dominant interaction is
electron (and hence spin) exchange in atomic collisions. There are several types
of permissible spin-exchange collisions. Let us label the four hyperfine states
by "a" for $F = 0$ and "b, c, d" for $F = 1$, $m = (-1, 0, 1)$, where $m$ is the
projection of the spin. If A and B label the two atoms, electron exchange must
obey the conservation law

$$ m_A + m_B = m_{A'} + m_{B'}. \tag{25} $$

One possible set of interactions has $\Delta F = 2$: cc$\rightarrow$aa, bd$\rightarrow$aa, and db$\rightarrow$aa; we
denote their cross sections with $\sigma^+$. Another set has $\Delta F = 1$: bd$\rightarrow$ac,
db$\rightarrow$ac, cd$\rightarrow$ad, dc$\rightarrow$ad, bc$\rightarrow$ab, cb$\rightarrow$ab (and their inverses), which we denote
with $\sigma^-$. Finally, we let $\Delta F = 0$ transitions (bd$\rightarrow$cc, db$\rightarrow$cc, and their
inverses) have cross section $\sigma^0$.

We will now sketch how to compute $\kappa_{10}^{\rm HH}$, the effective rate coefficient for
these processes; more rigorous treatments can be found in the literature
[113,116-118]. Although a semi-classical approach is adequate at high
temperatures [112], a full quantum-mechanical treatment is necessary in the
general case. The problem can be described as a typical scattering event between
two identical particles that form an intermediate (virtual) hydrogen molecule
before separating again.

Actually, an analogous calculation that ignores nuclear symmetry serves to
illustrate the main principles [113]. The Schrödinger equation for the total
wavefunction $X$ of both atoms is

$$ \left(H - \frac{1}{2M} \nabla_R^2 - E\right) X(\mathbf{r}, \mathbf{R}) = 0, \tag{26} $$

where $\mathbf{r}$ represents the positions of the two electrons, $\mathbf{R}$ is the vector
joining the two nuclei, $M$ is the reduced mass, $E$ is the total energy, and $H$ is
the Hamiltonian of the system when the nuclei are fixed in space. The lowest
energy eigenstates of $H$ are $X_s$ and $X_t$, the singlet and triplet states of the
hydrogen molecule ($^1\Sigma_g$ and $^3\Sigma_u$, respectively). For slow collisions, the total
wave function can then be written as a superposition of these two states

$$ X(\mathbf{r}, \mathbf{R}) = F_s(\mathbf{R})\, X_s(\mathbf{r}; \mathbf{R}) + F_t(\mathbf{R})\, X_t(\mathbf{r}; \mathbf{R}), \tag{27} $$

where $F_{s,t}$ are determined as follows. As usual for scattering problems, in the
elastic scattering limit (valid for $T_K \gg T_\star$), the asymptotic solutions must
take the form (e.g., [119])

$$ F_{s,t}(\mathbf{R}) \sim e^{ikz} + \frac{e^{ikR}}{R}\, f_{s,t}(\theta, \phi), \tag{28} $$

where $k$ is the relative momentum and $(R, \theta, \phi)$ are the spherical coordinate
components of $\mathbf{R}$. The angular dependence (and hence the functions $f_{s,t}$) can
be found by substituting the form (27) into equation (26) and expanding in
Legendre polynomials $P_l(\cos\theta)$. This reduces the problem to an infinite set of
radial equations, indexed by the order $l$ of the associated Legendre polynomial
and known as partial wave equations. It can be shown that the solution has
the form

$$ f_{s,t}(k, \theta) = \frac{1}{k} \sum_{l=0}^{\infty} (2l+1)\, e^{i\delta_l^{s,t}} \sin\delta_l^{s,t}\, P_l(\cos\theta). \tag{29} $$

The phase shifts $\delta_l^{s,t}$ quantify the coherence of the scattering amplitudes over
the different partial waves. The particulars of the scattering problem enter the
solution only through these phase shifts, and they are ultimately determined
by the H$_2$ energy potential curves in the singlet and triplet states. The total
cross section averages over the spins of the particles; the spin exchange cross
section, on the other hand, is a coherent sum $\propto |f_t - f_s|^2$.

The only difference for identical particles is that the wavefunction $X$ must be
made antisymmetric with respect to interchange of the two nuclei [116]. The
resulting cross sections are [118]

$$ \sigma^\pm = \frac{\pi}{k^2} \sum_{l=0}^{\infty} (2l+1) \sin^2\left(\delta_l^t - \delta_l^s\right) \left[1 - (-1)^{l + 1/2 \pm 1/2}\right], \tag{30} $$

and $\sigma^0 = \sigma^+$. The factor in square brackets results from nuclear symmetry; it
would be unity for distinguishable particles (and hence there would be only
one cross section). The most recent evaluations of $\sigma^\pm$ using state-of-the-art
molecular potentials and the close-coupling method can be found in [118];
they are accurate to < 5% at $T_K < 3$ K (where non-adiabatic effects become
important), with even smaller errors at higher energies. Note that $\sigma^- \to 0$ as
$E \to 0$ because the S-wave scattering term ($l = 0$) vanishes in equation (30);
this is a result of nuclear symmetry and does not occur in collisions between
distinguishable nuclei.

The cross sections $\sigma^\pm$ depend sensitively on $E$ because of resonances with
the molecular potential. But in real-world applications, it is the thermally
averaged cross section that is relevant:

$$ \bar\sigma^\pm = (k_B T_K)^{-2} \int dE\, \sigma^\pm(E)\, E\, e^{-E/k_B T_K}, \tag{31} $$

which smooths out the structure in $\sigma^\pm$. The corresponding rate coefficients
are

$$ k^\pm = \sqrt{\frac{8 k_B T_K}{\pi M}}\, \bar\sigma^\pm. \tag{32} $$

From detailed balance, the excitation rates are

$$ k^\pm_\uparrow = k^\pm \exp(-\omega^\pm), \tag{33} $$

where $\omega^- = \omega = E_{10}/k_B T_K$ and $\omega^+ = 2\omega$.
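
To make the thermal averaging concrete, the sketch below evaluates
$(k_B T_K)^{-2} \int dE\, \sigma(E)\, E\, e^{-E/k_B T_K}$ by Simpson's rule for an arbitrary
user-supplied cross section, and converts the result to a rate coefficient using
the mean relative speed $\sqrt{8 k_B T_K/\pi M}$ (the standard kinetic-theory form,
assumed here). Units are arbitrary but must be consistent, and the test cross
sections are purely hypothetical.

```python
import math


def thermal_average(sigma_of_e, kt, n_steps=4000, e_max_over_kt=50.0):
    """Thermally averaged cross section for temperature kt (energy units).

    sigma_of_e must be finite at E = 0; n_steps must be even (Simpson's rule).
    """
    de = e_max_over_kt * kt / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        energy = i * de
        if i in (0, n_steps):
            weight = 1.0
        else:
            weight = 4.0 if i % 2 else 2.0
        total += weight * sigma_of_e(energy) * energy * math.exp(-energy / kt)
    return total * de / 3.0 / kt ** 2


def rate_coefficient(sigma_bar, kt, reduced_mass):
    """Mean relative speed times the thermally averaged cross section."""
    return math.sqrt(8.0 * kt / (math.pi * reduced_mass)) * sigma_bar


def excitation_rate(k_down, omega):
    """Detailed balance between single sublevels: k_up = k_down exp(-omega)."""
    return k_down * math.exp(-omega)
```

A constant cross section is returned unchanged by the average, while
$\sigma \propto E$ yields $2 k_B T_K$ times the constant of proportionality; sharp
resonances in $\sigma(E)$ are smeared out in the same way.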

With these cross sections in hand, we can compute how the level populations
evolve, assuming that they are independent of atomic velocities. Collecting all
the transitions for each state, and writing $k^\pm_\uparrow$ for the excitation and $k^\pm$ for
the de-excitation rate coefficients, the rate equations may be written [118]

$$ \dot{n}_d = \dot{n}_b = n_a^2 k^+_\uparrow + 2 n_a n_c k^-_\uparrow - 2 n_b n_d (k^+ + k^-) + n_c^2 k^+ $$
$$ \dot{n}_c = n_a^2 k^+_\uparrow + 2 k^-_\uparrow n_a (n_b - n_c + n_d) - 2 n_b n_c k^- + 2 n_b n_d (k^- + k^+) - 3 n_c^2 k^+ - 2 n_c n_d k^- $$
$$ \dot{n}_a = -3 n_a^2 k^+_\uparrow - 2 k^-_\uparrow n_a (n_b + n_c + n_d) + 2 n_b n_c k^- + 2 n_b n_d (k^- + k^+) + n_c^2 k^+ + 2 n_c n_d k^-, \tag{34} $$


where we have set $k^0 \approx k^+$ at the temperatures of interest. In most
situations, the level populations will be near thermodynamic equilibrium, so we
can linearize equations (34) about that point. Writing $n_1 = n_b + n_c + n_d$
and $n_0 = n_a$, and assuming statistical equilibrium among these sublevels, the
linearized form is

$$ \dot{n}_1 = n_0\, \kappa_{01}^{\rm HH}\, n_{\rm HI} - n_1\, \kappa_{10}^{\rm HH}\, n_{\rm HI}, \tag{35} $$

where we have assumed $n_0 e^{-\omega} \approx n_1/3$ and defined

$$ \kappa_{10}^{\rm HH} = (k^+ + k^-)/2 = \kappa_{01}^{\rm HH}\, e^{\omega}/3 \tag{36} $$

to be the effective de-excitation rate. Note that the reduction to equation (35)
is not trivial, because equations (34) are nonlinear in the level densities.
Some earlier calculations did not linearize properly and so underestimated
$\kappa_{10}^{\rm HH}$ [116,117]. Fortunately, the linearization is sufficiently accurate that
this single effective rate equation (as opposed to the full nonlinear system) is
accurate throughout the regime of interest [118]. However, when collisions
dominate but are still relatively weak, the assumption that $n_1/n_0$ is independent
of atomic velocity is not always a good one at the percent level [111].

Table 3 lists values for $\kappa_{10}^{\rm HH}$ from [118] (for $T_K < 300$ K) and from [120] (for
higher temperatures). For $T_K > 5000$ K, the latter uses an extrapolation for
the cross section at high $l$, so the values are only approximate in this regime
(however, at such high temperatures collisions with hydrogen atoms rarely
dominate anyway). At even higher temperatures, excitations to higher atomic
levels begin to dominate and this calculation breaks down. Figure 2 shows
the rate coefficient graphically. It decreases rapidly at small temperatures
but varies rather gently with temperature at $T_K > 50$ K. Note that [118] did
not assume elastic collisions and so is valid to smaller energies than previous
estimates.

We end our discussion of H-H interactions with a caveat about the use of
equation (35) [118]. Its derivation requires statistical equilibrium for the $F = 1$
magnetic sublevels. Unfortunately, collisions cannot establish such an
equilibrium on their own (for example, $\dot{n}_b - \dot{n}_d = 0$ is always obeyed). We must
rely on two other mechanisms to depolarize these populations: radiative
transitions and dipolar spin-spin interactions. The latter are long-range magnetic
forces suppressed by a factor $\sim (v/c)^2$ and are ineffective in the IGM.
Fortunately, absorption of CMB photons does efficiently mix the levels. The weak
primordial polarization of the CMB [43,121,122] may leave the atoms slightly
polarized, but its effects on the spin temperature are likely to be extremely
small.

$T_K$ (K)    $\kappa_{10}^{\rm HH}$ (cm$^3$ s$^{-1}$)    $T_K$ (K)    $\kappa_{10}^{\rm HH}$ (cm$^3$ s$^{-1}$)
1            $1.38 \times 10^{-13}$                      80           $1.02 \times 10^{-10}$
2            $1.43 \times 10^{-13}$                      90           $1.11 \times 10^{-10}$
4            $2.71 \times 10^{-13}$                      100          $1.19 \times 10^{-10}$
6            $6.60 \times 10^{-13}$                      200          $1.75 \times 10^{-10}$
8            $1.47 \times 10^{-12}$                      300          $2.09 \times 10^{-10}$
10           $2.88 \times 10^{-12}$                      500          $2.56 \times 10^{-10}$
15           $9.10 \times 10^{-12}$                      700          $2.91 \times 10^{-10}$
20           $1.78 \times 10^{-11}$                      1000         $3.31 \times 10^{-10}$
25           $2.73 \times 10^{-11}$                      2000         $4.27 \times 10^{-10}$
30           $3.67 \times 10^{-11}$                      3000         $4.97 \times 10^{-10}$
40           $5.38 \times 10^{-11}$                      5000         $6.03 \times 10^{-10}$
50           $6.86 \times 10^{-11}$                      7000         $6.87 \times 10^{-10}$
60           $8.14 \times 10^{-11}$                      10000        $7.87 \times 10^{-10}$
70           $9.25 \times 10^{-11}$

Table 3

De-excitation rate coefficients for H-H collisions. Data for $T_K < 300$ K is from
[118]; higher temperature data is courtesy K. Sigurdson (using the methods of
[120]). The rate coefficients at $T_K > 5000$ K are only approximate (see text).
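
The tabulated H-H rate coefficients can be turned into a collisional coupling
coefficient with a few lines of code. The following sketch interpolates Table 3
in log-log space (our choice of interpolation scheme, not one prescribed by the
text) and evaluates $x_c^{\rm HH} = n_{\rm HI}\, \kappa_{10}^{\rm HH}\, T_\star / (A_{10} T_\gamma)$ for gas at the
mean density. It assumes the hydrogen density normalization quoted earlier with
the helium-dependent factor set to unity, and ignores all other collision
partners.

```python
import math

A_10 = 2.85e-15   # spontaneous emission coefficient [s^-1]
T_STAR = 0.068    # equivalent temperature of the transition [K]

# Table 3: kinetic temperature [K] and H-H de-excitation rate [cm^3 s^-1].
T_GRID = [1, 2, 4, 6, 8, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
          200, 300, 500, 700, 1000, 2000, 3000, 5000, 7000, 10000]
KAPPA_HH = [1.38e-13, 1.43e-13, 2.71e-13, 6.60e-13, 1.47e-12, 2.88e-12,
            9.10e-12, 1.78e-11, 2.73e-11, 3.67e-11, 5.38e-11, 6.86e-11,
            8.14e-11, 9.25e-11, 1.02e-10, 1.11e-10, 1.19e-10, 1.75e-10,
            2.09e-10, 2.56e-10, 2.91e-10, 3.31e-10, 4.27e-10, 4.97e-10,
            6.03e-10, 6.87e-10, 7.87e-10]


def kappa10_hh(t_k):
    """Log-log interpolation of Table 3; valid for 1 K <= t_k <= 10^4 K."""
    if not T_GRID[0] <= t_k <= T_GRID[-1]:
        raise ValueError("t_k outside the tabulated range")
    for i in range(len(T_GRID) - 1):
        if t_k <= T_GRID[i + 1]:
            lo, hi = math.log(T_GRID[i]), math.log(T_GRID[i + 1])
            frac = (math.log(t_k) - lo) / (hi - lo)
            log_kappa = ((1.0 - frac) * math.log(KAPPA_HH[i])
                         + frac * math.log(KAPPA_HH[i + 1]))
            return math.exp(log_kappa)


def x_c_hh(z, t_k, x_hi=1.0, n_h0=2.06e-7, t_gamma0=2.73):
    """Collisional coupling coefficient from H-H collisions at mean density."""
    n_hi = x_hi * n_h0 * (1.0 + z) ** 3   # proper HI density [cm^-3]
    t_gamma = t_gamma0 * (1.0 + z)        # CMB temperature [K]
    return n_hi * kappa10_hh(t_k) * T_STAR / (A_10 * t_gamma)
```

For instance, neutral gas at $z = 50$ with $T_K = 100$ K has $x_c^{\rm HH}$ of order
one half in this estimate, so collisions alone couple the spin temperature only
partially to the gas at that epoch, and the coupling weakens rapidly toward
lower redshift as the density falls.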

2.2.2 H-e$^-$ Collisions

Collisions between neutral hydrogen atoms and free electrons can also induce
spin exchange. The cross section can be computed by considering the
scattering of a polarized beam of electrons by hydrogen [67]. This scattering
problem can be solved with the same methods as for H-H collisions (eqs. 28-32,
but with distinguishable particles). Of course, in this case the scattering
problem is a three-body one and so is more difficult to solve numerically. But,
just as before, spin exchange is driven by the differing triplet and singlet
scattering amplitudes ($f_t$ and $f_s$, respectively), and the spin exchange cross
section is $\propto |f_t - f_s|^2$, which reduces to the form of equation (30), except
for the absence of the nuclear symmetry factor.

The first cross section calculation [67] used the phase shifts of [123], while the
next [116] used the much more accurate phase shifts of [124] (also see the fit
of [125], which is accurate to ~7%). These early calculations included only
the $l = 0$ term in equation (30), corresponding to S-wave scattering. Since
then, H-e$^-$ scattering has received a great deal of attention in the atomic
physics literature, and updated spin de-excitation rate coefficients including
all the $l < 3$ partial waves have now been presented [126]. We summarize the
results in Table 4 and by the dashed curve in Figure 2. Note that
$\kappa_{10}^{e{\rm H}} \gg \kappa_{10}^{\rm HH}$ because (at a fixed temperature) free electrons have much larger
velocities than hydrogen atoms. Moreover (unlike for H-H), the H-e$^-$ cross
section does not cut off at small temperatures.

Fig. 2. De-excitation rate coefficients for H-H collisions (solid line) and H-e$^-$
collisions (dashed line), as a function of $T_K$ (K). Note that the net rates are
also proportional to the densities, so H-H collisions still dominate in a
weakly-ionized medium.

Nevertheless, H-H collisions typically dominate in the early Universe, because
the relic ionized fraction from cosmological recombination is $x_i \sim 2 \times 10^{-4}$
(see §3.1). Only if the Universe is significantly heated and ionized can
collisions with electrons become a strong coupling mechanism (see §7.2). Thus the
high temperature behavior of $\kappa_{10}^{e{\rm H}}$ is relatively important. The usual
calculation (given in Table 4), which neglects scattering at energies above the
$n = 2$ threshold, breaks down at $T_K > 1.5 \times 10^4$ K. Above that point, excitations,
ionizations, and high-energy elastic scattering will all affect $\kappa_{10}^{e{\rm H}}$. However,
in practice $\kappa_{10}^{e{\rm H}}$ probably becomes irrelevant before that point. H-e$^-$ collisions
begin to excite the 2P level of hydrogen at $T_K > 6200$ K [126].$^9$ Because
each Ly$\alpha$ photon generated through this process scatters $\sim 10^5$ times before
redshifting out of resonance, while only ~10% of collisions lead directly to
de-excitation, the radiation background generated by collisional excitation
completely dominates the spin temperature coupling at higher temperatures. Thus
the Wouthuysen-Field effect, to be described in the following section, cannot
be ignored once $T_K > 6200$ K, even in the absence of luminous sources.

9 In the presence of an X-ray background, which produces a non-thermal population
of high-energy photons, this threshold temperature can be much smaller [127-129].

Table 4

De-excitation rate coefficients for H-e$^-$ collisions, including only collisions
below the $n = 2$ threshold. Higher energy collisions become important at
$T_K > 2 \times 10^4$ K. Note that, at $T_K > 6200$ K, the Ly$\alpha$ background generated by
collisions can no longer be neglected. From [126].



2.2.3 Other Species 

Neutral hydrogen atoms can also collide with bare protons, deuterium atoms,
and helium atoms or ions. Proton collisions are generally subdominant:
$\kappa_{10}^{pH} \approx 3.2 \kappa_{10}^{HH}$ for relatively high temperatures [116], making them somewhat less
efficient than the accompanying free electrons. Neutral helium has a closed
shell of electrons in the ground state, so the Pauli exclusion principle prevents
electron exchange with it from causing spin change unless the helium atom
can be excited to the triplet state (requiring significantly more energy than
the cold neutral IGM can provide; see [111] for a detailed discussion). Ionized
helium avoids this problem and may be significant in partially ionized gas
(though the accompanying free electrons will still dominate because of their
larger velocities). To our knowledge, these rates have not yet been calculated.

Finally, we have collisions with trace elements. Spin exchange cross sections 
in H-D collisions have been evaluated by [120] (see §2.6). Although they are 
much larger than the corresponding H-H cross sections at small temperatures, 
their rarity means that they still have no significant effect on $T_S$.



Fig. 3. Level diagram illustrating the Wouthuysen-Field effect. We show the
hyperfine splittings of the 1S and 2P levels. The solid lines label transitions that
mix the ground state hyperfine levels, while the dashed lines label complementary
transitions that do not participate in mixing. From [130].

2.3 The Wouthuysen-Field Effect 



A less obvious coupling process has become known as the Wouthuysen-Field
mechanism$^{10}$ [66,67]. It is illustrated in Figure 3, where we have drawn the
hyperfine sublevels of the 1S and 2P states of HI. Suppose a hydrogen atom in
the hyperfine singlet state absorbs a Ly$\alpha$ photon. The electric dipole selection
rules allow $\Delta F = 0, \pm 1$ except that $F = 0 \to 0$ is prohibited (here $F$ is the
total angular momentum of the atom). Thus the atom will jump to either
of the central 2P states. However, these rules allow this state to decay to
the ${}_1S_{1/2}$ triplet level.$^{11}$ Thus atoms can change hyperfine states through
the absorption and spontaneous re-emission of a Ly$\alpha$ photon (or indeed any
Lyman-series photon; see §2.4 below). This is analogous to the well-known
"Raman scattering" process, which often determines the level populations of
metastable atomic states, except that in this case the atom undergoes a real
(rather than virtual) transition to the 2P state.



$^{10}$ As a guide to the English-speaking reader, "Wouthuysen" is pronounced as
roughly "Vowt-how-sen," although in reality the "uy" construction is a diphthong
with no precise counterpart in English.

$^{11}$ Here we use the notation ${}_F L_J$, where $L$ and $J$ are the orbital and total angular
momentum of the electron.

2.3.1 An Approximate Treatment



We begin with a relatively simple and intuitive treatment of this process.
Reality is considerably more complicated; we discuss more precise calculations
in §2.3.3 below. The Wouthuysen-Field coupling must depend on the total rate
(per atom) at which Ly$\alpha$ photons are scattered within the gas,

$$P_\alpha = 4\pi \chi_\alpha \int d\nu \, J_\nu(\nu) \, \phi_\alpha(\nu), \qquad (37)$$

where $\sigma_\nu \equiv \chi_\alpha \phi_\alpha(\nu)$ is the local absorption cross section, $\chi_\alpha \equiv (\pi e^2/m_e c) f_\alpha$,
$f_\alpha = 0.4162$ is the oscillator strength of the Ly$\alpha$ transition, $\phi_\alpha(\nu)$ is the Ly$\alpha$
absorption profile, and $J_\nu$ is the angle-averaged specific intensity of the background
radiation field (by number, not energy). In the simplest approximation,
we assume $J_\nu$ to be constant across the line; we will see in §2.3.3 that
this is often not a valid assumption. The line often has a Voigt profile (if
only thermal and natural broadening are relevant). For concreteness, thermal
broadening yields a Doppler width

$$\Delta\nu_D = \sqrt{\frac{2 k_B T_K}{m_p c^2}} \, \nu_\alpha, \qquad (38)$$

where $\nu_\alpha = 2.47 \times 10^{15}$ Hz is the central Ly$\alpha$ frequency.
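To give a feel for the numbers in equation (38), the following Python sketch evaluates the thermal Doppler width, $\Delta\nu_D = \sqrt{2 k_B T_K / m_p c^2}\, \nu_\alpha$, and converts a frequency into the dimensionless offset $x = (\nu - \nu_\alpha)/\Delta\nu_D$ used later in this section. The physical constants are standard CGS values that we supply ourselves, and the function names are purely illustrative.

```python
import math

K_B = 1.380649e-16    # Boltzmann constant [erg/K]
M_P_C2 = 1.50328e-3   # proton rest energy m_p c^2 [erg]
NU_ALPHA = 2.47e15    # Ly-alpha line-centre frequency [Hz], as quoted in the text


def doppler_width(t_k):
    """Thermal Doppler width of Ly-alpha [Hz] for kinetic temperature t_k [K]."""
    return math.sqrt(2.0 * K_B * t_k / M_P_C2) * NU_ALPHA


def frequency_offset(nu, t_k):
    """Dimensionless offset x = (nu - nu_alpha) / Delta nu_D."""
    return (nu - NU_ALPHA) / doppler_width(t_k)


if __name__ == "__main__":
    for t_k in (10.0, 1000.0):
        print(f"T_K = {t_k:6.0f} K  ->  Doppler width = {doppler_width(t_k):.3e} Hz")
```

For $T_K = 10$ K this gives $\Delta\nu_D \approx 3.3 \times 10^9$ Hz, so the line core is narrower than $\nu_\alpha$ by roughly six orders of magnitude; the width grows only as $T_K^{1/2}$.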

Our goal here is to relate this total scattering rate $P_\alpha$ to the indirect
de-excitation rate $P_{10}$ [67,68,131]. In this section we will make the simplifying
assumption that $J_\nu$ is constant across the Ly$\alpha$ transition. We first label the 1S
and 2P hyperfine levels a-f, in order of increasing energy, and let $A_{ij}$ and $B_{ij}$ be
the spontaneous emission and absorption coefficients for transitions between
these levels. We write the background flux at the frequency corresponding to
the $i \to j$ transition as $J_{ij}$. Then

$$P_{01} \propto B_{ad} J_{ad} \frac{A_{db}}{A_{da} + A_{db}} + B_{ae} J_{ae} \frac{A_{eb}}{A_{ea} + A_{eb}}. \qquad (39)$$



The first term contains the probability for an $a \to d$ transition ($B_{ad} J_{ad}$), together
with the probability for the subsequent decay to terminate in state b;
the second term is the same for transitions to and from state e. Next we need
to relate the individual $A_{ij}$ to $A_\alpha = 6.25 \times 10^8$ Hz, the total Ly$\alpha$ spontaneous
emission rate (averaged over all the hyperfine sublevels). This can be
accomplished using a sum rule stating that the sum of decay intensities ($g_i A_{ij}$)
for transitions from a given $nFJ$ to all the $n'J'$ levels (summed over $F'$) is
proportional to $2F + 1$ (e.g., [132]); the relative strengths of the permitted
transitions are then (1, 1, 2, 2, 1, 5), where we have ordered the lines (bc, ad,
bd, ae, be, bf) and the two letters represent the initial and final states. With
our assumption that the background radiation field is constant across the individual
hyperfine lines, we then find $P_{10} = (4/27) P_\alpha$ [67,68] (see [131] for a
detailed derivation).
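The factor 4/27 can be recovered with a few lines of exact arithmetic. The sketch below is our own bookkeeping rather than the derivation of [131]: it assumes a flat background, treats each quoted relative strength as proportional to $g_{\rm upper} A_{\rm upper \to lower}$, and takes the absorption rate per atom in the lower level to scale as (strength)/$g_{\rm lower}$. The level labels follow the text; the function names are illustrative.

```python
from fractions import Fraction

# Statistical weights g = 2F + 1 of the hyperfine levels a-f (a, b: 1S; c-f: 2P).
G = {"a": 1, "b": 3, "c": 1, "d": 3, "e": 3, "f": 5}

# Relative line strengths, proportional to g_upper * A(upper -> lower),
# keyed by (lower, upper) in the order quoted in the text.
STRENGTH = {("b", "c"): 1, ("a", "d"): 1, ("b", "d"): 2,
            ("a", "e"): 2, ("b", "e"): 1, ("b", "f"): 5}


def absorption_rate(lower, upper):
    """Relative absorption rate per atom in `lower`, for a flat background."""
    return Fraction(STRENGTH.get((lower, upper), 0), G[lower])


def branching(upper, lower):
    """Probability that `upper` decays to `lower`."""
    total = sum(s for (_, up), s in STRENGTH.items() if up == upper)
    return Fraction(STRENGTH.get((lower, upper), 0), total)


def total_scattering_rate(lower):
    """Relative total Ly-alpha scattering rate per atom in `lower`."""
    return sum(absorption_rate(lower, up) for up in "cdef")


def mixing_rate(start, end):
    """Relative rate of start -> end transitions via any 2P level."""
    return sum(absorption_rate(start, up) * branching(up, end) for up in "cdef")


p_alpha = total_scattering_rate("b")
p_10 = mixing_rate("b", "a")   # indirect de-excitation, triplet -> singlet
p_01 = mixing_rate("a", "b")   # indirect excitation, singlet -> triplet

if __name__ == "__main__":
    print("P_10 / P_alpha =", p_10 / p_alpha)
    print("P_01 / P_10    =", p_01 / p_10)
```

Under these assumptions the total scattering rate is the same for atoms in either ground-state level, and the ratio $P_{01}/P_{10}$ equals the ratio of statistical weights, as expected when the background is flat across the hyperfine lines.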

The coupling coefficient $x_\alpha$ may then be written

$$x_\alpha = \frac{4 P_\alpha}{27 A_{10}} \frac{T_\star}{T_\gamma} = S_\alpha \frac{J_\alpha}{J_\nu^c}, \qquad (40)$$

where in the second equality we evaluate $J_\nu = J_\alpha$ neglecting radiative transfer
effects and set $J_\nu^c \equiv 1.165 \times 10^{-10} [(1 + z)/20]$ cm$^{-2}$ s$^{-1}$ Hz$^{-1}$ sr$^{-1}$. As we
will see below, the background spectrum around Ly$\alpha$ is nontrivial, so we must
include a correction factor $S_\alpha$ that accounts for variations near the line center.
This coupling threshold for $x_\alpha = S_\alpha$ can also be written in terms of the number
of Ly$\alpha$ photons per hydrogen atom, which we denote $\tilde{J}_\nu^c = 0.0767 [(1 + z)/20]^{-2}$.
As we will see in §3, it is relatively easy to achieve in practice.
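The two ways of quoting the coupling threshold can be connected numerically. The sketch below evaluates $x_\alpha = S_\alpha J_\alpha / J_\nu^c$ and converts a flux into photons per hydrogen atom using $(4\pi J/c)\,\nu_\alpha / n_H$. The conversion, the baryon density $\Omega_b h^2 = 0.023$, the helium mass fraction 0.24, and all function names are our own assumptions for illustration; they are not specified at this point in the text.

```python
import math

C_LIGHT = 2.998e10   # speed of light [cm/s]
NU_ALPHA = 2.47e15   # Ly-alpha frequency [Hz], as quoted in the text


def critical_flux(z):
    """Threshold flux J_nu^c [photons cm^-2 s^-1 Hz^-1 sr^-1] from the text."""
    return 1.165e-10 * (1.0 + z) / 20.0


def x_alpha(j_alpha, z, s_alpha=1.0):
    """Ly-alpha coupling coefficient x_alpha = S_alpha * J_alpha / J_nu^c."""
    return s_alpha * j_alpha / critical_flux(z)


def hydrogen_density(z, omega_b_h2=0.023, helium_mass_fraction=0.24):
    """Mean hydrogen number density [cm^-3]; the cosmology here is assumed."""
    n_baryon_today = 1.123e-5 * omega_b_h2   # rho_crit * Omega_b / m_p [cm^-3]
    return (1.0 - helium_mass_fraction) * n_baryon_today * (1.0 + z) ** 3


def photons_per_hydrogen(j_alpha, z):
    """Ly-alpha photons per hydrogen atom, (4 pi J / c) * nu_alpha / n_H."""
    return 4.0 * math.pi * j_alpha * NU_ALPHA / (C_LIGHT * hydrogen_density(z))


if __name__ == "__main__":
    z = 19.0
    j = critical_flux(z)
    print("x_alpha at threshold:", x_alpha(j, z))
    print("photons per H atom  :", round(photons_per_hydrogen(j, z), 4))
```

With these inputs the threshold flux at $1 + z = 20$ corresponds to roughly 0.077 Ly$\alpha$ photons per hydrogen atom, close to the value quoted above; because $J_\nu^c \propto (1 + z)$ while $n_H \propto (1 + z)^3$, the photon-per-atom threshold falls as $(1 + z)^{-2}$.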

The remainder of this section will show how to calculate $T_c$ and $S_\alpha$. For the
reader who is not interested in the details, equations (52) and (55) provide the
basic results that can be used to compute $T_S$ (see also [133]). Note, however,
that both $T_c$ and $S_\alpha$ implicitly depend on $T_S$, so all three must be solved for
simultaneously.



2.3.2 The Color Temperature 

The Ly$\alpha$ coupling also depends on the effective temperature $T_c$ of the UV
radiation field, defined in equation (22). This is determined by the shape of
the photon spectrum at the Ly$\alpha$ resonance. That the effective temperature
of the radiation field must matter is easy to see: the energy defect between
the different hyperfine splittings of the Ly$\alpha$ transition implies that the mixing
process is sensitive to the gradient of the background spectrum near the Ly$\alpha$
resonance. More precisely, the procedure described near equation (39) lets us
write

$$\frac{P_{01}}{P_{10}} = \frac{g_1}{g_0} \frac{n_{ad} + n_{ae}}{n_{bd} + n_{be}} \approx 3 \left( 1 + \nu_0 \frac{d \ln n_\nu}{d\nu} \right), \qquad (41)$$

where $n_\nu = c^2 J_\nu / 2\nu^2$ is the photon occupation number. Thus by comparison
to equation (22), we have [98] (neglecting stimulated emission [134])

$$\frac{h}{k_B T_c} = -\frac{d \ln n_\nu}{d\nu}. \qquad (42)$$

Note that much of the literature uses a slightly different definition of the
color temperature (in terms of $J_\nu$) that does not obey detailed balance (e.g.,
[133, 135]). Obviously the color temperature is a function of frequency; in
detail, it should be harmonically averaged over the line profile [136], but this
makes only a tiny difference in practice [135]. Note as well that slightly different
definitions of $T_c$ can be useful for certain applications [136].

Simple arguments show that $T_c \approx T_K$: all boil down to the observation that,
so long as the medium is extremely optically thick, the enormous number of
Ly$\alpha$ scatterings must bring the Ly$\alpha$ profile to a blackbody of temperature $T_K$
near the line center [66]. This condition is easily fulfilled in the high-redshift
IGM, where in our cosmology the mean Ly$\alpha$ optical depth experienced by a
photon that redshifts across the entire resonance is [137]

$$\tau_{\rm GP} = \frac{\chi_\alpha n_{\rm HI}(z) \, c}{H(z) \, \nu_\alpha} \approx 3 \times 10^5 \, \bar{x}_{\rm HI} \left( \frac{1 + z}{7} \right)^{3/2}. \qquad (43)$$



Field [138] showed that atomic recoil during scattering, which tilts the spec- 
trum to the red, is primarily responsible for establishing this equilibrium. 
However, as we will see, spin exchange slightly modifies this expectation. We 
will examine the solution in more detail below (see eq. 52). 

2.3.3 The Radiation Field 

The scattering process is actually much more complicated than naively expected
for two basic reasons: (1) scattering itself modifies the shape of $J_\nu$,
and (2) the hyperfine sublevels must be taken into account. Recent treatments
have incorporated both these effects; the former is particularly important
[133-136,139,140]. Intuitively, a flat input spectrum develops an absorption
feature because of the increased scattering rate near the Ly$\alpha$ resonance.
Photons everywhere continually lose energy by redshifting, but they also lose
energy through recoil and spin excitation whenever they scatter. If the drift
is denoted $\mathcal{A}$, continuity would require $n_\nu \mathcal{A} =$ constant (assuming photons are
neither created nor destroyed in the line and neglecting diffusion from
re-emission); when $\mathcal{A}$ increases near resonance, the number density must fall. On
average, the energy loss (or gain) per scattering is [134]

$$\Delta E_{\rm recoil} = \frac{(h\nu)^2}{m_p c^2} \left( 1 - \frac{T_K}{T_c} \right), \qquad (44)$$



where the first factor comes from recoil off an isolated atom [98] and the second
factor corrects for the distribution of initial photon energies [134,135];
note that it vanishes as $T_c \to T_K$, which also separates the heating and cooling
regimes. The small energy defect between the hyperfine levels provides
another (weak) source of energy exchange [133,139]. The latter process can
be incorporated into the scattering in nearly the same way as recoil.



Our main goal in this section is to show how to compute the photon spectrum
near Ly$\alpha$. We begin with the radiative transfer equation in an expanding
universe (written in comoving coordinates, neglecting stimulated emission,
and using photon number rather than energy for $J_\nu$) [136, 141]:

$$\frac{1}{c \, n_H \chi_\alpha} \frac{\partial J_\nu}{\partial t} = -\phi_\alpha(\nu) J_\nu + H \nu_\alpha \frac{\partial J_\nu}{\partial \nu} + \int d\nu' \, R(\nu, \nu') J_{\nu'} + C(t) \psi(\nu). \qquad (45)$$



Here the first term on the right-hand side describes absorption, the second
the Hubble flow, and the third re-emission following absorption. $R(\nu, \nu')$ is the
"redistribution function" that describes the frequency of an emitted photon,
which depends on the relative velocities and directions of the absorbed and
emitted photons as well as the absorbing atom [142]. The last term describes
injection of new photons: $C$ is the rate at which they are produced and $\psi(\nu)$
is their frequency distribution.

The redistribution function $R$ is the complicated aspect of the problem, and
general solutions to equation (45) are not available. Fortunately, it can be
simplified if the frequency change per scattering (typically of order $\Delta\nu_D$ [142])
is "small" (see below). In that case, we can expand $J_{\nu'}$ to second order in
$(\nu - \nu')$ and rewrite equation (45) as a diffusion problem in frequency. The
steady-state version of equation (45) becomes, in this Fokker-Planck approximation
[133, 134],$^{12}$

$$\frac{d}{dx} \left( -\mathcal{A} J + \mathcal{D} \frac{dJ}{dx} \right) + C \psi(x) = 0, \qquad (46)$$



where $x \equiv (\nu - \nu_\alpha)/\Delta\nu_D$, $\mathcal{A}$ is the frequency drift, and $\mathcal{D}$ is the diffusivity.
The problem is not, unfortunately, completely specified, because $\mathcal{D}$ is constructed
from an approximation to $R$ and so depends on the physical processes
incorporated. The resulting freedom can be dealt with in a number
of ways [134,141,143,144]. One important caveat (in this particular application)
is that the approximation, just like the original problem, should obey
detailed balance [144], which ensures microscopic reversibility. This was not
satisfied by the diffusivity chosen in [141], which has been used in most recent
work [98,133,135,139]. A "corrected" form was recently presented by [134]. An
alternative approach is to expand the scattering probability itself, rather than
$J_\nu$ or $n_\nu$ [136]. This appears to give somewhat less accurate results but has
some formal advantages. In general, these Fokker-Planck approximations are
valid when (i) the frequency change per scattering ($\sim \Delta\nu_D$) is much smaller
than the width of any spectral features, and either (iia) we are outside the
line core, where $d\phi_\alpha/dx$ is small, or (iib) we are in equilibrium with $T_c \approx T_K$.
Fortunately, Monte Carlo simulations have verified the utility of equation (46)
even when these conditions are marginally violated [133].

$^{12}$ Note that we neglect stimulated emission here, which adds a term $\propto J^2$ inside
the brackets [134].

Solving for the background spectrum thus reduces to specifying $\mathcal{A}$ and $\mathcal{D}$ for
each of several processes. The first is the Hubble flow, which causes a drift
$\mathcal{A}_H = -\gamma$ (without any associated diffusion), where $\gamma$ is the Sobolev parameter
that parameterizes the bulk velocity of the medium. In our case, neglecting
peculiar velocities, $\gamma = \tau_{\rm GP}^{-1}$. The remaining terms come from expanding $R$,
which must incorporate all the physical processes relevant to energy exchange
in scattering. The commonly used $R_{\rm II}$, which describes frequency redistribution
if the scattering is coherent in the rest frame of the atom [142, 145], will
not suffice, because we must include recoil and spin exchange as well. The
diffusivity and drift from recoil are [133, 134, 139, 141, 143]

$$\mathcal{D}_{\rm scatt} = \phi_\alpha(x)/2, \qquad (47)$$

$$\mathcal{A}_{\rm scatt} = -(\eta - x_0^{-1}) \, \phi_\alpha(x), \qquad (48)$$

where $x_0 \equiv \nu_\alpha/\Delta\nu_D$ and $\eta \equiv (h \nu_\alpha^2)/(m_p c^2 \Delta\nu_D)$. The latter is the recoil
parameter measuring the average loss per scattering in units of the Doppler
width. The second term in equation (48) follows from detailed balance and
ensures the proper approach to thermal equilibrium [134]. When $T_K$ is small,
we must also include energy loss during spin exchange [133, 139], which has
$\mathcal{D}_{\rm se} \approx (2\nu_0^2/9\Delta\nu_D^2) \, \mathcal{D}_{\rm scatt}$ because the frequency change from spin exchange is
$\pm\nu_0$ and only a fraction 2/9 of absorptions actually lead to spin exchange.
When $T_K$ is extremely small, each hyperfine transition should be treated
separately, with distinct Ly$\alpha$ line profiles $\phi_{10}(\nu)$ and $\phi_{01}(\nu)$ for indirect
de-excitations and excitations of the ground state hyperfine levels.

To solve equation (46) we must specify the boundary conditions, which essentially
correspond to the input photon spectrum (ignoring scattering) and
the source function. Because the frequency range of interest is so narrow, two
cases suffice (e.g., [135,139]). These are a flat input spectrum (which describes
photons that redshift through the Ly$\alpha$ resonance) and a step function, where
photons are "injected" at line center (through cascades or recombinations; see
§2.4 below) and redshift away. In either case, the first integral over $x$ is trivial
and we can write [146]

$$\phi_\alpha \frac{dJ}{dx} + 2 (\eta' \phi_\alpha + \gamma') J = 2 \gamma' K, \qquad (49)$$

where [139]

$$\gamma' = \gamma \, (1 + T_{\rm se}/T_K)^{-1}, \qquad (50)$$

$$\eta' = \eta \left( \frac{1 + T_{\rm se}/T_S}{1 + T_{\rm se}/T_K} \right) - x_0^{-1}, \qquad (51)$$

where $T_{\rm se} = (2/9) \, T_K \nu_0^2/\Delta\nu_D^2 = 0.40$ K. The terms involving $T_{\rm se}$ are added
by spin exchange; note that they become unimportant when the temperature
is large and are often neglected. The integration constant $K$ equals $J_\infty$, the
flux far from resonance, for photons that redshift into the line and for injected
photons at $x < 0$; it is zero for injected photons at $x > 0$. This form is not
accurate when $T_K \sim T_\star$, in which case a numerical solution must be used [133].
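As a quick check on the quoted value of $T_{\rm se}$, the sketch below evaluates $(2/9)\, T_K \nu_0^2/\Delta\nu_D^2$ with the thermal Doppler width $\Delta\nu_D = \sqrt{2 k_B T_K/m_p c^2}\, \nu_\alpha$. Because $\Delta\nu_D^2 \propto T_K$, the kinetic temperature cancels and $T_{\rm se}$ is a pure constant. The 21 cm rest frequency and the CGS constants are values we supply ourselves.

```python
import math

K_B = 1.380649e-16    # Boltzmann constant [erg/K]
M_P_C2 = 1.50328e-3   # proton rest energy m_p c^2 [erg]
NU_ALPHA = 2.47e15    # Ly-alpha frequency [Hz], as quoted in the text
NU_21CM = 1.4204e9    # 21 cm hyperfine frequency [Hz] (assumed standard value)


def doppler_width(t_k):
    """Thermal Doppler width of Ly-alpha [Hz]."""
    return math.sqrt(2.0 * K_B * t_k / M_P_C2) * NU_ALPHA


def spin_exchange_temperature(t_k):
    """T_se = (2/9) T_K nu_0^2 / Delta nu_D^2 [K]; t_k cancels analytically."""
    return (2.0 / 9.0) * t_k * NU_21CM ** 2 / doppler_width(t_k) ** 2


if __name__ == "__main__":
    for t_k in (1.0, 10.0, 1000.0):
        print(f"T_K = {t_k:7.1f} K  ->  T_se = {spin_exchange_temperature(t_k):.3f} K")
```

The result is about 0.40 K at every temperature, which is why the spin-exchange corrections in $\gamma'$ and $\eta'$ matter only when $T_K$ or $T_S$ approaches a fraction of a Kelvin.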

When $\gamma$ is small (or in the limit of large optical depth, an excellent approximation
for our purposes) it follows from equations (42) and (49) that

$$T_c = T_K \left( \frac{1 + T_{\rm se}/T_K}{1 + T_{\rm se}/T_S} \right), \qquad (52)$$

agreeing with the qualitative arguments of §2.3.2 when spin exchange is neglected
(in the limit $T_S, T_K \gg T_{\rm se}$). The spin temperature affects the rate at
which photons lose energy during scattering (because it determines the level
populations and hence the probabilities of excitation and de-excitation) and
so the background spectrum near the Ly$\alpha$ resonance, which in turn affects
the spin temperature itself. Thus $T_S$ must actually be determined iteratively,
beginning with a guess for $T_S$, calculating $T_c$ and the effective Wouthuysen-Field
coupling strength, computing $T_S$, and repeating. Fortunately the process
converges rapidly [133].
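The iteration just described can be sketched as a simple fixed-point loop. The code below is a minimal illustration under several assumptions of ours: it takes the color temperature in the large optical depth limit as $T_c = T_K (1 + T_{\rm se}/T_K)/(1 + T_{\rm se}/T_S)$, uses the usual weighted-mean relation $T_S^{-1} = (T_\gamma^{-1} + x_c T_K^{-1} + x_\alpha T_c^{-1})/(1 + x_c + x_\alpha)$ from the earlier part of this review, and holds $x_\alpha$ fixed (ignoring the implicit dependence of $S_\alpha$ on $T_S$). The function names and the example inputs are hypothetical.

```python
T_SE = 0.40  # spin-exchange temperature [K], as quoted in the text


def color_temperature(t_k, t_s, t_se=T_SE):
    """Color temperature [K] in the large optical depth limit (assumed form)."""
    return t_k * (1.0 + t_se / t_k) / (1.0 + t_se / t_s)


def spin_temperature(t_gamma, t_k, t_c, x_c, x_alpha):
    """Weighted-mean spin temperature [K] (assumed standard relation)."""
    inverse = (1.0 / t_gamma + x_c / t_k + x_alpha / t_c) / (1.0 + x_c + x_alpha)
    return 1.0 / inverse


def solve_spin_temperature(t_gamma, t_k, x_c, x_alpha, tol=1e-10, max_iter=100):
    """Iterate T_S -> T_c -> T_S until the relative change drops below tol."""
    t_s = t_gamma  # initial guess: spin temperature equal to the CMB temperature
    for iteration in range(1, max_iter + 1):
        t_c = color_temperature(t_k, t_s)
        t_s_new = spin_temperature(t_gamma, t_k, t_c, x_c, x_alpha)
        if abs(t_s_new - t_s) <= tol * abs(t_s_new):
            return t_s_new, t_c, iteration
        t_s = t_s_new
    raise RuntimeError("spin temperature iteration did not converge")


if __name__ == "__main__":
    t_s, t_c, n_iter = solve_spin_temperature(t_gamma=50.0, t_k=10.0, x_c=0.0, x_alpha=1.0)
    print(f"T_S = {t_s:.3f} K, T_c = {t_c:.3f} K after {n_iter} iterations")
```

Because $T_{\rm se}$ is small compared with the temperatures of interest, $T_c$ depends only weakly on $T_S$, so each pass changes the answer by a small fraction of the previous correction and the loop settles within a handful of iterations.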

Several solutions to equation (46) have been presented in the literature. We
begin with the formal analytic solution [140,146]. When $K \neq 0$, it is most
compactly written in terms of $\delta_J \equiv (J_\infty - J)/J_\infty$:

$$\delta_J(x) = 2\eta' \int_0^\infty dy \, \exp\left[ -2\eta' y - 2\gamma' \int_{x-y}^{x} \frac{dx'}{\phi_\alpha(x')} \right]. \qquad (53)$$



An analogous form to equation (53) exists for photons injected at line center.
To include the full Voigt profile, equation (53) must be solved numerically
[133,135]. But these equations can be further simplified by approximating the
full line profile with the Lorentzian wings from natural broadening [139,147];
in that case the inner integral is trivial. This assumption is quite accurate
when $T_K < 1000$ K; at higher temperatures, collisional broadening affects the
equilibrium spectrum at the several percent level [140].

The crucial point of equation (53) is that (as expected from the qualitative argument
above) an absorption feature appears near the line center; its strength
is roughly proportional to $\eta$, our recoil parameter. The feature is more significant
when $T_K$ is small (or the average effect of recoil is large). Figure 4
shows some example spectra (both for a continuous background and for photons
injected at line center). Note that the feature is asymmetric; this will be
important in §3.2.2.

Fig. 4. Background radiation field near the Ly$\alpha$ resonance at $z = 10$, assuming a
Voigt line profile. The upper and lower sets are for continuous photons and photons
injected at line center, respectively. (The former are normalized to $J_\infty$; the latter
have arbitrary normalization.) The solid and dashed curves take $T_K = 10$ and
1000 K, respectively. From [140].

For now, the most important result is the suppression of the radiation spectrum
at line center compared to the (flat) solution without radiative transfer.
This decreases the total scattering rate of Ly$\alpha$ photons (and hence the
Wouthuysen-Field coupling) below what one naively expects. The suppression
factor is [135]

$$S_\alpha = \int_{-\infty}^{\infty} dx \, \phi_\alpha(x) \, J(x) \approx [1 - \delta_J(0)] \le 1, \qquad (54)$$



where the second equality follows from the narrowness of the line profile. For
a full Voigt profile, the integral must be performed numerically [133,135]. But
the wing approximation turns out to be an excellent one; when spin exchange
can be neglected, the suppression is [139, 140]

$$S_\alpha \sim \exp\left[ -1.12 \, \eta \left( \frac{3a}{2\pi\gamma} \right)^{1/3} \right] \sim \exp\left[ -0.803 \, T_K^{-2/3} \left( \frac{10^{-6}}{\gamma} \right)^{1/3} \right], \qquad (55)$$



where $a = A_\alpha/(4\pi\Delta\nu_D)$ is the Voigt parameter and in the second equality we
have neglected spin exchange and expressed $T_K$ in degrees Kelvin. Note that
this form applies to both photons injected at line center and those that redshift
in from infinity. As we can see in Figure 4, the suppression is most significant
in cool gas. This is convenient because the Lorentzian wing approximation
is least accurate at large temperatures; fortunately in that regime $S_\alpha \sim 1$
anyway, so equation (55) is accurate to a few percent at all $T_K > 1$ K [140].
At smaller temperatures, a numerical solution including the Voigt profile, the
full integral over $\phi_\alpha$, and spin exchange, is necessary [133].
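The two forms of the suppression factor can be compared numerically. The sketch below evaluates $\exp[-1.12\, \eta\, (3a/2\pi\gamma)^{1/3}]$ using $\eta = h\nu_\alpha^2/(m_p c^2 \Delta\nu_D)$ and $a = A_\alpha/(4\pi\Delta\nu_D)$, and sets it against the temperature scaling $\exp[-0.803\, T_K^{-2/3} (10^{-6}/\gamma)^{1/3}]$, both with spin exchange neglected. The CGS constants are standard values we supply ourselves, and $\gamma = 10^{-6}$ is simply a convenient illustrative choice.

```python
import math

H_PLANCK = 6.62607e-27   # Planck constant [erg s]
K_B = 1.380649e-16       # Boltzmann constant [erg/K]
M_P_C2 = 1.50328e-3      # proton rest energy m_p c^2 [erg]
NU_ALPHA = 2.47e15       # Ly-alpha frequency [Hz], as quoted in the text
A_ALPHA = 6.25e8         # Ly-alpha spontaneous emission rate [Hz], as quoted


def doppler_width(t_k):
    """Thermal Doppler width of Ly-alpha [Hz]."""
    return math.sqrt(2.0 * K_B * t_k / M_P_C2) * NU_ALPHA


def recoil_parameter(t_k):
    """eta = h nu_alpha^2 / (m_p c^2 Delta nu_D), dimensionless."""
    return H_PLANCK * NU_ALPHA ** 2 / (M_P_C2 * doppler_width(t_k))


def voigt_parameter(t_k):
    """a = A_alpha / (4 pi Delta nu_D), dimensionless."""
    return A_ALPHA / (4.0 * math.pi * doppler_width(t_k))


def suppression_from_parameters(t_k, gamma):
    """S_alpha written in terms of eta, a and gamma."""
    eta = recoil_parameter(t_k)
    a = voigt_parameter(t_k)
    return math.exp(-1.12 * eta * (3.0 * a / (2.0 * math.pi * gamma)) ** (1.0 / 3.0))


def suppression_from_temperature(t_k, gamma):
    """S_alpha written as an explicit function of T_K [K] and gamma."""
    return math.exp(-0.803 * t_k ** (-2.0 / 3.0) * (1.0e-6 / gamma) ** (1.0 / 3.0))


if __name__ == "__main__":
    for t_k in (1.0, 10.0, 100.0, 1000.0):
        s1 = suppression_from_parameters(t_k, 1.0e-6)
        s2 = suppression_from_temperature(t_k, 1.0e-6)
        print(f"T_K = {t_k:7.1f} K   S_alpha = {s1:.4f} (parameters)   {s2:.4f} (T_K form)")
```

The two expressions agree to well within a percent over this range, and both show the behavior described in the text: $S_\alpha \approx 0.84$ at $T_K = 10$ K for $\gamma = 10^{-6}$, rising toward unity in warm gas.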



2.4 The Lyman Series 



The Wouthuysen-Field mechanism describes how Ly$\alpha$ photon scattering affects
$T_S$. Of course, the neutral IGM has so much hydrogen that any photon
redshifting into a Lyman series resonance will be absorbed almost immediately.
In this section, we will compute the coupling induced by photons that enter
$n > 1$ Lyman lines [130, 133]. Note that earlier work often ignored the detailed
atomic physics of these transitions and incorrectly assumed their coupling to
be identical to Ly$\alpha$ photons.

Suppose that a photon redshifts into the Ly$n$ resonance. After absorption, it
can either scatter (through a decay directly to the ground state) or can vanish
if the atom instead decays to an intermediate level. The scattering probability
$P_{nP \to 1S}$ follows directly from the Einstein coefficients,

$$P_{if} = \frac{A_{if}}{\sum_{f'} A_{if'}}, \qquad (56)$$

where $i$ and $f$ denote the initial and final states, respectively, and the sum
is over all allowed final states. Table 5 shows the direct decay probabilities.
Typically, a Ly$n$ photon will scatter $N_{\rm scatt} \approx (1 - P_{nP \to 1S})^{-1} \sim 5$ times before
being consumed by a decay cascade. As a result, coupling from the direct
scattering of Ly$n$ photons is unimportant relative to Ly$\alpha$ photons [130], which
vanish only when they redshift across the line. Thus the coupling from Ly$n$
scattering is suppressed by a factor $\sim N_{\rm scatt}/\tau_{\rm GP} < 10^{-6}$.
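To make the branching-ratio bookkeeping concrete, the sketch below evaluates the direct decay probability $A_{if}/\sum_{f'} A_{if'}$ and the resulting number of scatterings for Ly$\beta$ and Ly$\gamma$. It also follows the cascades that do not return directly to the ground state and records the fraction that pass through 2P, and so end in a Ly$\alpha$ photon; this is the recycling fraction that becomes important in the remainder of this section. The Einstein coefficients are approximate values from standard hydrogen tabulations that we supply as an assumption; they are not taken from the text, and the function names are illustrative.

```python
# Approximate hydrogen Einstein A coefficients [s^-1], keyed by initial level
# and then by final level. Assumed illustrative inputs, not values from the text.
A_COEFF = {
    "3P": {"1S": 1.673e8, "2S": 2.245e7},
    "4P": {"1S": 6.819e7, "2S": 9.668e6, "3S": 3.065e6, "3D": 3.475e5},
    "3S": {"2P": 6.313e6},
    "3D": {"2P": 6.465e7},
}


def decay_probabilities(level, exclude=()):
    """Branching ratios out of `level`, optionally ignoring some final states."""
    rates = {f: a for f, a in A_COEFF[level].items() if f not in exclude}
    total = sum(rates.values())
    return {f: a / total for f, a in rates.items()}


def direct_decay_probability(level):
    """Probability that an nP state decays straight back to 1S."""
    return decay_probabilities(level)["1S"]


def mean_scatterings(level):
    """N_scatt = 1 / (1 - P_direct): scatterings before a cascade occurs."""
    return 1.0 / (1.0 - direct_decay_probability(level))


def recycling_fraction(level):
    """Fraction of cascades from `level` that end in a Ly-alpha photon."""
    if level == "2P":
        return 1.0   # 2P decays to 1S by emitting Ly-alpha
    if level == "2S":
        return 0.0   # 2S decays by two-photon emission, producing no Ly-alpha
    # Decays straight to 1S regenerate the Ly-n photon, so they are excluded.
    probabilities = decay_probabilities(level, exclude=("1S",))
    return sum(p * recycling_fraction(f) for f, p in probabilities.items())


if __name__ == "__main__":
    for name, level in (("Ly-beta", "3P"), ("Ly-gamma", "4P")):
        print(f"{name}: P_direct = {direct_decay_probability(level):.4f}, "
              f"N_scatt = {mean_scatterings(level):.1f}, "
              f"f_rec = {recycling_fraction(level):.4f}")
```

With these inputs Ly$\beta$ photons scatter about eight times and never produce Ly$\alpha$, while Ly$\gamma$ photons scatter about six times and are recycled into Ly$\alpha$ roughly a quarter of the time. Extending the table to higher levels would allow the same recursion to be applied to any Ly$n$ line.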

However, Ly$n$ photons can still be important because of their cascade products.
Consider the decay chains shown in Figure 5. For Ly$\beta$, the only permitted
decays are to the ground state (regenerating a Ly$\beta$ photon and starting the
process again) or to the 2S level. The H$\alpha$ photon produced in the 3P $\to$ 2S
transition (and indeed any photon produced in a decay to an excited state)
escapes to infinity. Thus the atom will eventually find itself in the 2S state,
which decays to the ground state via a forbidden two photon process with
$A_{2S \to 1S} = 8.2$ s$^{-1}$ (other processes, such as collisional excitation, are much
slower [133]). These photons too will escape to infinity. Thus coupling from
Ly$\beta$ photons can be completely neglected.

Table 5

Recycling fractions $f_{\rm rec}(n)$ and direct decay probabilities to the ground state $P_{nP \to 1S}$
for the first 29 Lyman-series transitions. From [130, 133].

But now consider excitation by Ly$\gamma$, also shown in Figure 5. This too can decay
directly to the ground state or to 2S. But it can also cascade (through 3S or
3D) to the 2P level, in which case the original Ly$n$ photon is "recycled" into
a Ly$\alpha$ photon. This photon contributes to the background responsible for the
Wouthuysen-Field effect just as any other Ly$\alpha$ photon would (it is an injected
photon in the terminology of [135]). Thus the key quantity for determining
the coupling induced by Ly$n$ photons is the fraction $f_{\rm rec}(n)$ of cascades that
terminate in Ly$\alpha$ photons. This can be computed iteratively from [130, 133]

$$f_{\rm rec}(i) = \sum_f P_{if} \, f_{\rm rec}(f), \qquad (57)$$

where the sum is again over all allowed final states. (Note that the sum should
exclude transitions directly to the ground state, because such decays regenerate
the Ly$n$ photon and start the whole process again.) Table 5 lists $f_{\rm rec}(n)$ for
$n < 30$. The recycling fractions rapidly approach $f_{\rm rec} \approx 0.36$ at large $n$. Thus
the Wouthuysen-Field coupling from Ly$n$ photons is about one-third as efficient
as that from Ly$\alpha$ photons, with the important exception that $f_{\rm rec}(3) = 0$.
We show how to compute the total coupling from all Lyman-series photons in
§3.3.

Fig. 5. Decay chains for Ly$\beta$ and Ly$\gamma$. We show Ly$n$ transitions by dashed curves,
Ly$\alpha$ by the dot-dashed curve, cascades by solid curves, and the forbidden 2S $\to$ 1S
transition by the dotted curve. From [130].



2.5 Polarization

There are two mechanisms that could induce polarization in the 21 cm line.
The first is anisotropic excitation of the triplet state. For example, if a single
source of Ly$\alpha$ photons dominated the Wouthuysen-Field coupling, the
resulting level populations would be polarized. However, because $\tau_{\rm GP} > 10^5$,
the coupling is actually produced through repeated scatterings of each Ly$\alpha$
photon. This efficiently isotropizes the Ly$\alpha$ radiation field, and this kind of
polarization is unlikely to be important in practice [148].



Magnetic fields can also induce polarization. The Zeeman effect splits the 21
cm transition into three parts [149,150]: an unpolarized central component
with frequency $\nu_0$ and two components with frequencies $\nu_0 \pm \Delta\nu_Z$, where

$$\Delta\nu_Z = \frac{e B_\parallel}{4\pi m_e c} \qquad (58)$$



and $B_\parallel$ is the magnetic field strength along the line of sight. Zeeman splitting
leaves the two states with momenta $\pm\hbar$, so conservation of angular momentum
demands that the higher and lower frequency states be right- and
left-circularly polarized, respectively. This frequency difference is, of course,
much too small to be observed directly, but a net signal will be present if
the difference in the polarized intensities exceeds the detector noise. Because
the two components are proportional to $\delta T_b(\nu_0 + \Delta\nu)$ (for right circular) and
$\delta T_b(\nu_0 - \Delta\nu)$ (for left circular), this can happen if there is a strong gradient
in $\delta T_b$: then observations corresponding to $\nu_0$ receive a strong signal in one
polarization and negligible in the other. Let $\delta T_{\rm diff}(\nu) = |\delta T_L(\nu) - \delta T_R(\nu)|$ be
the difference in brightness temperature between the left- and right-handed
components. Then

$$\delta T_{\rm diff}(\nu) \approx 2 \, \Delta\nu \left| \frac{d \, \delta T_b}{d\nu} \right|, \qquad (59)$$

where $d\delta T_b/d\nu$ is the derivative of the brightness temperature (in this patch of
the sky) with observed frequency. This technique has been used to constrain
the Milky Way magnetic field [151].
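For orientation, the size of the effect can be estimated with a short calculation. The sketch below assumes the standard Zeeman shift for the 21 cm line, $\Delta\nu_Z = e B_\parallel/(4\pi m_e c)$, and a first-order expansion of the two circular components about $\nu_0$, so that the polarized difference is twice the shift times the local gradient of $\delta T_b$. The field strength and the brightness temperature gradient used in the example are hypothetical numbers chosen only for illustration.

```python
import math

E_CHARGE = 4.80320e-10   # electron charge [esu]
M_E = 9.10938e-28        # electron mass [g]
C_LIGHT = 2.99792e10     # speed of light [cm/s]


def zeeman_shift_hz(b_parallel_gauss):
    """Rest-frame Zeeman shift e B / (4 pi m_e c) [Hz] for a field in Gauss."""
    return E_CHARGE * b_parallel_gauss / (4.0 * math.pi * M_E * C_LIGHT)


def polarized_difference(shift_hz, gradient_per_hz):
    """First-order |dT_L - dT_R| = 2 * shift * |d(dT_b)/d(nu)|.

    The shift and the gradient must refer to the same frame; a rest-frame
    shift is smaller by (1 + z) when expressed in observed frequency.
    """
    return 2.0 * shift_hz * abs(gradient_per_hz)


if __name__ == "__main__":
    shift = zeeman_shift_hz(100.0e-6)   # hypothetical 100 microgauss field
    gradient = 20.0 / 2000.0            # hypothetical 20 mK change across 2 kHz
    print(f"Zeeman shift        = {shift:.1f} Hz")
    print(f"polarized difference = {polarized_difference(shift, gradient):.2f} mK")
```

The shift is about 1.4 Hz per microgauss, so even a 100 $\mu$G field displaces the two components by only about 140 Hz in the rest frame; a measurable signal therefore requires a brightness temperature edge that is sharp on kHz scales.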

Intrinsically polarized 21 cm emission or absorption is one of the few possibilities
for measuring magnetic fields in the high-redshift IGM (another is
Faraday rotation of radio-loud quasars). It is an interesting prospect because
the origin of intergalactic magnetic fields is controversial and essentially
unconstrained [152]. There are two kinds of possible measurements. First, targeted
observations of regions with strong gradients in the underlying $\delta T_b$ can
constrain their local magnetic fields [153]. For example, quasar HII regions
typically have edges of width $\sim 1.5 \, \bar{x}_{\rm HI}^{-1} (1 + z)^{-3}$ proper Mpc, corresponding
to $\Delta\nu \sim 2$ kHz; the softer spectra of stars yield even narrower edges. Any
coherent magnetic field across these edges will yield a residual polarization.
SKA-class instruments could (in the absence of systematics) detect magnetic
fields of order $\sim 100$ $\mu$G in these systems [153].

A second possibility is a statistical search for large-scale fluctuations in the
polarized emission or absorption. In this case the constraints are strongest
during epochs in which the globally-averaged brightness temperature changes
rapidly (see §3 for a discussion of the requisite conditions). However, realistically
the gradients will be modest, so the limits from SKA-class telescopes
will lie well above theoretical expectations [153]. Moreover, such statistical
techniques will likely be compromised by the much larger polarization from
Thomson scattering in the post-reionization Universe (see §4.3).



2.6 Other Hyperfine Transitions



Of course, the HI 21 cm line is only one of many hyperfine transitions. Two
others may be of particular interest for studying the IGM: the $\lambda = 91.6$ cm
transition of deuterium (only recently measured in the Milky Way ISM with
high statistical significance [154]) and the $\lambda = 3.46$ cm transition of $^3$He$^+$ (note
that $^4$He has zero nuclear spin, and neutral $^3$He has a closed shell of electrons;
thus neither has any hyperfine structure). Of course, both of these transitions
are exceedingly weak because [D/H] $\approx 3 \times 10^{-5}$ and [$^3$He/H] $\sim 10^{-5}$ according
to big bang nucleosynthesis [155]. The brightness temperatures corresponding
to equation (18) are then

$$\delta T_{b,{\rm D}} \approx 0.079 \, x_{\rm HI} \, (1 + \delta) \left( \frac{[{\rm D/H}]}{3 \times 10^{-5}} \right) \left( 1 - \frac{T_\gamma}{T_S} \right) (1 + z)^{1/2} \ \mu{\rm K}, \qquad (60)$$

$$\delta T_{b,{}^3{\rm He}} \approx 0.53 \, x_{\rm HeII} \, (1 + \delta) \left( \frac{[{}^3{\rm He/H}]}{10^{-5}} \right) \left( 1 - \frac{T_\gamma}{T_S} \right) (1 + z)^{1/2} \ \mu{\rm K}, \qquad (61)$$

where of course $T_S$ must be defined separately for D and $^3$He and need not
equal the HI value. For example, the de-excitation rate in D-H collisions is
many orders of magnitude larger than the corresponding rate for H-H collisions
at $T_K < 10$ K [120], so D has a much easier time maintaining $T_{S,{\rm D}} \neq T_\gamma$ than
H; note that earlier treatments incorrectly ignored this effect [67]. Lyman-series
coupling also differs between the three species [144] and can lead to
some surprising effects [139]. These brightness temperatures are so small that
detecting IGM fluctuations through either of these isotopes will require a truly
heroic effort.



However, detecting deuterium before reionization is not completely hopeless,
because density fluctuations in $\delta T_{b,{\rm D}}$ precisely trace those of hydrogen. Thus
a 21 cm map could be used as a "template," and the correlation coefficient
between H and D maps could be used to measure their abundance ratio [120].
By averaging over a large fraction of the sky and a large redshift interval
(and with a significantly larger instrument than any currently planned), we
could, in principle, detect the deuterium signal at the expected level, although
systematics may in the end prevent us from reaching the necessary precision.
On the other hand, this is the only known method to measure the truly primordial
[D/H] ratio and would provide an important confirmation of big bang
nucleosynthesis [155,156].

$^3$He$^+$ is also potentially interesting. HeI was likely reionized with HI, but
observations indicate that HeII is not reionized until $z \sim 3$ (e.g., [157]). Thus
it probably emits over a substantial frequency range, $\nu \sim 0.8$-$2.2$ GHz. In
principle, the $^3$He$^+$ transition could be detected through its anti-correlation
with HI during hydrogen reionization; unfortunately, this carries much less
information than the D-H correlation, because only the ionization pattern (and
not small-scale density fluctuations) contributes. On the other hand, the sky is
significantly quieter at these higher frequencies, so (depending on the details
of reionization) it may end up being an easier experiment [120]. In principle, it
would also be possible to study HeII reionization with this transition. Unfortunately,
simple sensitivity estimates following [158] show that such attempts
are well beyond any planned telescope.
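The quoted frequency range follows from redshifting the rest wavelength. The sketch below converts a rest wavelength and redshift into an observed frequency, $\nu_{\rm obs} = c/[\lambda_{\rm rest}(1 + z)]$. The upper redshift of 10 used for the low-frequency end is our own illustrative choice rather than a value given in the text, and the HI wavelength of 21.106 cm is a standard value we supply.

```python
C_CM_PER_S = 2.99792458e10   # speed of light [cm/s]


def observed_frequency_ghz(rest_wavelength_cm, z):
    """Observed frequency [GHz] of a line with the given rest wavelength."""
    return C_CM_PER_S / rest_wavelength_cm / (1.0 + z) / 1.0e9


if __name__ == "__main__":
    # 3He+ hyperfine line (3.46 cm) between an assumed z = 10 and z = 3.
    for z in (10.0, 3.0):
        print(f"3He+ at z = {z:4.1f}: {observed_frequency_ghz(3.46, z):.2f} GHz")
    # For comparison, the HI line (21.106 cm) and the deuterium line (91.6 cm).
    print(f"HI  at z = 10.0: {observed_frequency_ghz(21.106, 10.0) * 1e3:.0f} MHz")
    print(f"D   at z = 10.0: {observed_frequency_ghz(91.6, 10.0) * 1e3:.0f} MHz")
```

The $^3$He$^+$ line lands near 0.8 GHz at $z = 10$ and near 2.2 GHz at $z = 3$, far above the $\sim 100$ MHz band of redshifted HI, which is why the foreground sky is so much quieter for this transition.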



3 The Global Evolution of the IGM 



Now that we have reviewed the underlying physics of the 21 cm transition, we 
will consider how global properties of the IGM evolve through time. These may 
or may not be directly observable (see §3.6), but in any case they constitute 
the background against which the fluctuating signals discussed in the next five 
sections occur. 



3.1 The Dark Ages 



We begin by examining the cosmic dark ages, when the physics is rather
simple. The first step is to compute how $T_K$ evolves with time. Energy
conservation in the expanding IGM demands

$$\frac{dT_K}{dt} = -2 H(z) \, T_K + \frac{2}{3} \sum_i \frac{\epsilon_i}{k_B n}, \qquad (62)$$



where the first term on the right hand side is the $p \, dV$ work from expansion
and $\epsilon_i$ is the energy injected into the gas per second per unit (physical) volume
through process $i$. Before the first nonlinear objects appear, the only relevant
heating mechanism is Compton scattering between CMB photons and residual
free electrons in the IGM. The heating rate from this process can be calculated
from the drag force exerted by the isotropic CMB radiation field on a thermal
distribution of free electrons. It is [159, 160]

$$\frac{2}{3} \frac{\epsilon_{\rm comp}}{k_B n} = \frac{\bar{x}_i}{1 + f_{\rm He} + \bar{x}_i} \, \frac{T_\gamma - T_K}{t_\gamma}, \qquad (63)$$



where $t_\gamma = (3 m_e c)/(8 \sigma_T u_\gamma)$ is the Compton cooling time, $u_\gamma \propto T_\gamma^4$ is the
energy density of the CMB, $f_{\rm He}$ is the helium fraction (by number), and $\sigma_T$ is
the Thomson cross section. The first factor on the right hand side accounts for
the distribution of the energy over all free particles. Compton heating drives
$T_K \to T_\gamma$ when $u_\gamma$ and $\bar{x}_i$ are large; thus at sufficiently high redshifts the gas
can cool no faster than the CMB, $T_K \propto (1 + z)$.

Eventually, however, the gas does thermally decouple from the CMB. The
decoupling redshift can be computed precisely from the publicly available
RECFAST code [160,161], but a simple estimate provides the main result.
The recombination rate is $\dot{n}_e = -\alpha_B \bar{x}_i^2 \bar{n}_b^2$, where
$\alpha_B \propto T_K^{-0.7}$ is the case-B recombination coefficient (see
§3.4.1 below). The fractional change in $\bar{x}_i$ per Hubble time is therefore

$$\left|\frac{1}{H(z)\,n_e}\frac{dn_e}{dt}\right| \approx 100\,\bar{x}_i\,(1+z)^{0.8}\,\frac{\Omega_b h^2}{\sqrt{\Omega_m h^2}}, \qquad (64)$$

where we have assumed that $T_K \propto (1+z)$ (i.e., coupling to the CMB is
still strong). Freeze-out occurs when this is of order unity; at later times
$\bar{x}_i$ remains roughly constant because the recombination time then exceeds
the expansion timescale. Numerical calculations with RECFAST yield
$\bar{x}_i = 3.1 \times 10^{-4}$ at z = 200 (which is past freeze-out).
Inserting this into equation (63), we find



$$\frac{1}{H(z)\,T_K}\frac{dT_K}{dt} \approx \frac{10^{-7}}{\Omega_b h^2}\left(\frac{T_\gamma}{T_K} - 1\right)(1+z)^{5/2}. \qquad (65)$$

Thus thermal decoupling occurs when

$$1 + z_{\rm dec} \approx 150\,(\Omega_b h^2/0.023)^{2/5}. \qquad (66)$$



Figure 6a shows a more detailed calculation (from RECFAST). The dashed
curve shows $T_\gamma$ and the dotted curve $T_K$. We see that Compton heating
begins to become inefficient at z ~ 300 and is negligible by z ~ 150. Past this
point, $T_K \propto (1+z)^2$, as expected for an adiabatically expanding
non-relativistic gas.
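
As a rough illustration of equation (66) and the two temperature scalings, the
sketch below treats decoupling as instantaneous: the gas follows the CMB above
$z_{\rm dec}$ and cools adiabatically below it. The present-day CMB temperature
of 2.725 K and the function names are assumptions made for this example, and
the gradual transition visible in Figure 6a is not captured.

```python
T_CMB0 = 2.725  # K; present-day CMB temperature (assumed value, not from the text)


def one_plus_z_dec(omega_b_h2=0.023):
    """1 + z_dec from equation (66)."""
    return 150.0 * (omega_b_h2 / 0.023) ** 0.4


def gas_temperature(z, omega_b_h2=0.023):
    """Toy T_K(z) in kelvin, assuming instantaneous decoupling at z_dec."""
    zp1 = 1.0 + z
    zp1_dec = one_plus_z_dec(omega_b_h2)
    if zp1 >= zp1_dec:
        # Tightly coupled to the CMB: T_K = T_gamma, proportional to (1 + z).
        return T_CMB0 * zp1
    # Adiabatic expansion of a non-relativistic gas: T_K proportional to (1 + z)^2.
    return T_CMB0 * zp1_dec * (zp1 / zp1_dec) ** 2
```

Because the real decoupling is gradual, this toy model should only be used to
build intuition for the two limiting regimes.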

We must next determine the spin temperature. Obviously $x_\alpha = 0$ during
this epoch (barring any exotic processes [162]). But at sufficiently high
redshifts, the Universe was dense enough for collisions with neutral atoms to
be efficient in the mean density IGM. The effectiveness of collisional coupling
can be computed exactly for any given temperature history from the rate
coefficients presented in §2.2. A convenient estimate of their importance is
the critical overdensity, $\delta_{\rm coll}$, at which $x_c = 1$:

$$1 + \delta_{\rm coll} = 1.06\left[\frac{\kappa_{10}(88\,{\rm K})}{\kappa_{10}(T_K)}\right]\left(\frac{0.023}{\Omega_b h^2}\right)\left(\frac{70}{1+z}\right)^2, \qquad (67)$$

where we have inserted the expected temperature at 1 + z = 70. Thus for
redshifts z < 70, $T_S \to T_\gamma$; by z ~ 30 the IGM essentially becomes
invisible. It is worth emphasizing that $\kappa_{10}$ is extremely sensitive to
$T_K$ in this regime (see Fig. 2). If the universe is somehow heated above the
fiducial value, the threshold density can remain modest:
$\delta_{\rm coll} \approx 1$ at z = 40 if $T_K$ = 300 K. The solid line in
Figure 6a shows the spin temperature $T_S$ during the dark ages, and Figure 6b
shows the corresponding brightness temperature. The signal peaks (in
absorption) at z ~ 80, where $T_K$ is small but collisional coupling is still
efficient. Because of the simple physics involved in Figure 6, the 21 cm line
offers a sensitive probe of the dark ages [2], at least in principle.

Fig. 6. (a): IGM temperature evolution if only adiabatic cooling and Compton
heating are involved. The spin temperature $T_S$ includes only collisional
coupling. (b): Differential brightness temperature against the CMB for $T_S$
shown in panel a.

3.2 The Thermal History of the IGM

Figure 6b also shows that the z < 30 Universe would remain invisible with- 
out luminous sources. Our next task is to consider their effects on the 21 
cm background. We will begin by examining several processes that affect the 
temperature evolution. 

3.2.1 X-ray Heating 

Because they have relatively long mean free paths, X-rays from galaxies and
quasars are likely to be the most important heating agent for the low-density
IGM. In particular, photons with
$E > 1.5\,\bar{x}_{\rm HI}^{1/3}[(1+z)/10]^{1/2}$ keV have mean free paths
exceeding the Hubble length [163]. Given our enormous uncertainties about the
nature of high-redshift objects, it is of course impossible to describe the
high-redshift X-ray background with any confidence (see §7.2 for a discussion
of current constraints). Our strategy (here and in the remainder of this
chapter) must therefore be to construct simple, but plausible, models to
examine the range of possibilities. The most conservative assumption is that
the local correlation between the star formation rate (SFR) and the X-ray
luminosity (from 0.2-10 keV) can be extrapolated to high redshift [164-167]:

$$L_X = 3.4 \times 10^{40}\,f_X\left(\frac{\rm SFR}{1\,M_\odot\,{\rm yr}^{-1}}\right)\,{\rm erg\,s^{-1}}, \qquad (68)$$

where $f_X$ is an unknown renormalization factor at high redshift. Note that
the numerical factor depends on the photon energy range assumed to contribute
to IGM heating; soft photons probably carry most of the energy, but they do not
penetrate far into the IGM (and hence induce reasonably strong fluctuations
[129,168]).

We can only speculate as to the accuracy of this correlation at higher
redshifts. Certainly the scaling is appropriate so long as stars dominate, but
$f_X$ will likely evolve with redshift. The X-ray emission has two major
sources. The first is inverse-Compton scattering off of relativistic electrons
accelerated in supernovae. In the nearby Universe, only powerful starbursts
have strong enough radiation fields for this to be significant; however, at
high redshifts it probably plays an increasingly important role because
$u_\gamma \propto (1+z)^4$ [163]. Assuming that ~ 5% of the supernova energy is
released in this form [169] yields $f_X$ ~ 0.5 if ~ $10^{51}$ ergs are released
in supernovae per 100 $M_\odot$ in star formation. The second class of sources,
which dominate in locally observed galaxies, are high-mass X-ray binaries, in
which material from a massive main sequence star accretes onto a compact
neighbor. Such systems are born as soon as the first massive stars die, only a
few million years after star formation commences. So they certainly ought to
exist in high-redshift galaxies [167], although their abundance depends on the
metallicity and stellar initial mass function, and it could be especially large
if very massive Population III stars$^{13}$ dominate.

X-rays heat the IGM gas by first photoionizing a hydrogen or helium atom.
The hot "primary" electron then distributes its energy through three main
channels: (1) collisional ionizations, producing more secondary electrons, (2)
collisional excitations of HeI (which produce photons capable of ionizing HI)
and HI (which produces a Lyα background), and (3) Coulomb collisions with
free electrons. The relative cross sections of these processes determine what
fraction of the X-ray energy goes to heating ($f_{X,h}$), ionization
($f_{X,\rm ion}$), and excitation ($f_{X,\rm coll}$); clearly it depends on
both $\bar{x}_i$ and the input photon energy. Monte Carlo simulations can be
used to estimate these fractions, and an accurate fitting function exists in
the high-energy limit [170]. However, a crude approximation often suffices
[171]:

$$f_{X,h} \sim (1 + 2\bar{x}_i)/3,$$
$$f_{X,\rm ion} \sim f_{X,\rm coll} \sim (1 - \bar{x}_i)/3. \qquad (69)$$

In highly ionized gas, collisions with free electrons dominate and
$f_{X,h} \to 1$; in the opposite limit, the energy is split roughly equally
between these processes.
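
The crude partition in equation (69) is simple enough to encode directly. The
following sketch (function name chosen here for illustration) returns the three
fractions for a given ionized fraction and makes the two limits quoted above
easy to check.

```python
def xray_energy_fractions(x_i):
    """Return (f_heat, f_ion, f_coll) from the crude approximation of eq. (69)."""
    if not 0.0 <= x_i <= 1.0:
        raise ValueError("x_i must lie between 0 and 1")
    f_heat = (1.0 + 2.0 * x_i) / 3.0
    f_ion = (1.0 - x_i) / 3.0
    f_coll = (1.0 - x_i) / 3.0
    return f_heat, f_ion, f_coll
```

The three fractions sum to unity by construction, so no X-ray energy is lost in
this approximation.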

X-rays directly heat a small fraction of electrons, which must then transfer
energy to the other particles [98]. The primary photoelectrons, with T ~
$10^6$ K, rapidly cool to energies just below the Lyα threshold, < 10 eV, by
collisionally ionizing and exciting hydrogen atoms before equilibrating with
the other electrons. After that, the electrons and neutrals equilibrate through
elastic scattering on a timescale $t_{\rm eq} \sim 5[10/(1+z)]^3$ Myr. Because
$t_{\rm eq} \ll H(z)^{-1}$, the assumption of a single temperature fluid is an
excellent one.

Finally, to relate the X-ray emissivity to the global SFR we will assume (again
for simplicity) that the SFR is proportional to the rate at which gas collapses
onto virialized halos, $df_{\rm coll}/dt$ (see eq. 7). In that case, we can write

$$\frac{2}{3}\frac{\epsilon_X}{k_B n H(z)} = 10^3\,{\rm K}\; f_X\left(\frac{f_\star}{0.1}\,\frac{f_{X,h}}{0.2}\,\frac{df_{\rm coll}/dz}{0.01}\,\frac{1+z}{10}\right), \qquad (70)$$

where $f_\star$ is the star formation efficiency. It is immediately obvious
that X-ray heating is quite rapid; we will examine this in more detail in §3.5.
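
To get a feel for the numbers in equation (70), the sketch below evaluates its
right hand side, i.e. the temperature increment per Hubble time in kelvin. The
function name and the use of the magnitude of $df_{\rm coll}/dz$ (which is
negative, since the collapse fraction grows with time) are choices made for
this example.

```python
def xray_heating_per_hubble_time(f_X=1.0, f_star=0.1, f_Xh=0.2,
                                 dfcoll_dz=0.01, z=9.0):
    """Right hand side of eq. (70): temperature gain per Hubble time, in K."""
    return (1.0e3 * f_X
            * (f_star / 0.1)
            * (f_Xh / 0.2)
            * (abs(dfcoll_dz) / 0.01)   # magnitude of the collapse rate per unit z
            * ((1.0 + z) / 10.0))
```

At the fiducial parameters the function returns 1000 K, far above the adiabatic
gas temperature at z ~ 10, which is the sense in which the heating is rapid.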

Of course, even if equation (68) is accurate, there may be other contributions
to the X-ray background. Quasars are one obvious example, although their
relevance is far from clear. The known population of bright quasars,
extrapolated to higher redshifts, causes negligible heating [172]. But these
bright quasars may be only the tip of the iceberg: "miniquasars," which are
rapidly accreting intermediate-mass black holes that may form from the remnants
of Pop III stars, can strongly affect the thermal history [163, 167, 173, 174].
For example, if the "Magorrian relation" between black hole and stellar mass
[175] (see also [176,177]) holds at high redshifts, the equivalent
normalization factor would be $f_X$ ~ 10. We review existing limits on the
X-ray background in §7.2.

13 Henceforth we will abbreviate "Population III" with "Pop III," and similarly
for Pop II.

3.2.2 Lyα Heating

In addition to setting the spin temperature, resonant scattering of Lyα photons
can also heat the gas through atomic recoil. The typical energy exchange per
scattering is small (see eq. 44), but the number of scatterings is large. If the
net heating rate per atom were ~ $P_\alpha \times (h\nu_\alpha)^2/m_p c^2$, the
gas temperature would exceed $T_\gamma$ soon after Wouthuysen-Field coupling
becomes efficient [98].

However, the details of radiative transfer radically change these expectations.
In a static medium, the energy exchange must vanish in equilibrium even
though scattering continues at (nearly) the same rate. Scattering induces an
asymmetric absorption feature near $\nu_\alpha$ (Fig. 4) whose shape depends on
the combined effects of atomic recoils and the scattering diffusivity (see the
first paragraph of §2.3.3). In equilibrium, the latter exactly counterbalances
the former. Without scattering, the absorption feature would redshift away;
thus the equilibrium energy exchange rate is simply that required to maintain
the feature in place [135]. For photons redshifting into resonance, the
absorption trough has total energy
$\Delta u_\alpha = (4\pi/c)\int (J_\infty - J_\nu)\,h\nu\,d\nu$, where
$J_\infty$ is the input spectrum (thus the integration extends over the dip in
Fig. 4). The radiation background loses $\epsilon_\alpha = H\,\Delta u_\alpha$
per unit time through redshifting; this energy goes into heating the gas. We
therefore have

$$\frac{2}{3}\frac{\epsilon_\alpha}{k_B T_K n_H H(z)} = \frac{8\pi}{3}\,\frac{h\nu_\alpha}{k_B T_K}\,\frac{J_\infty\,\Delta\nu_D}{c\,n_H}\,I_c, \qquad (71)$$

where (using $\nu \approx \nu_\alpha$ throughout the region of interest)

$$I_c = \int_{-\infty}^{\infty} dx\,\delta_J(x). \qquad (72)$$

Analogous expressions exist for photons injected at line center, except that
here the corresponding integral $I_i$ measures the difference between the
actual spectra shown in Figure 4 and a spectrum with a step function at line
center.



For an arbitrary line profile these integrals must be performed numerically
(see [135], although that does not include spin exchange [133, 139] or the
detailed balance correction [134], so it is only accurate to a few percent).
However, approximating the line profile with the Lorentzian wings of natural
broadening turns out to be an excellent approximation [147, 178] and allows an
analytic expression (in terms of Airy functions) for $I_c$ and a power series
expansion for $I_i$ [140]. The first few terms are

$$I_c = 3^{1/3}\left(\frac{2\pi}{3}\right)^{5/3}\left(\frac{a}{\gamma}\right)^{1/3}\left[\frac{\beta}{\Gamma^2(2/3)} - \frac{3^{1/3}\beta^2}{\Gamma(1/3)\Gamma(2/3)} + \frac{3^{2/3}\beta^3}{\Gamma^2(1/3)} - \ldots\right], \qquad (73)$$

$$I_i = \left(\frac{a}{\gamma}\right)^{1/3}\left(-0.6979 + 2.4138\beta - 2.7755\beta^2 + 2.2724\beta^3 - \ldots\right), \qquad (74)$$

where $a = A_{21}/(4\pi\Delta\nu_D)$, $A_{21}$ is the spontaneous emission
coefficient for the Lyα transition, and
$\beta = \eta(4a/\pi\gamma)^{1/3} \approx 0.99\,T_K^{-2/3}(\tau_{\rm GP}/10^6)^{1/3}$
(here the approximate form ignores spin exchange and the correction for
detailed balance). When $\beta$ is small and only the first terms contribute
(reasonably accurate at $T_K \gtrsim 10$ K for $I_c$ but not as good for
$I_i$), we expect $I_c \propto T_K^{-5/6}(1+z)$ and
$I_i \propto T_K^{-1/6}(1+z)^{1/2}$.
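
The truncated series can be evaluated numerically as a quick check of signs and
magnitudes. The sketch below takes $I_c$ with the prefactor
$3^{1/3}(2\pi/3)^{5/3}(a/\gamma)^{1/3}$ and $I_i$ with the prefactor
$(a/\gamma)^{1/3}$; these normalizations are an assumption about how equations
(73) and (74) are written, the ratio $a/\gamma$ is left as a caller-supplied
input, and the function names are illustrative. Only the terms through
$\beta^3$ are kept, so the result is meaningful only for small $\beta$.

```python
import math


def beta_approx(T_K, tau_gp):
    """Approximate beta (no spin exchange or detailed-balance correction)."""
    return 0.99 * T_K ** (-2.0 / 3.0) * (tau_gp / 1.0e6) ** (1.0 / 3.0)


def lya_integrals(beta, a_over_gamma):
    """Truncated series for (I_c, I_i); prefactors are assumptions (see text)."""
    g13 = math.gamma(1.0 / 3.0)
    g23 = math.gamma(2.0 / 3.0)
    scale = a_over_gamma ** (1.0 / 3.0)
    prefactor = 3.0 ** (1.0 / 3.0) * (2.0 * math.pi / 3.0) ** (5.0 / 3.0)
    I_c = prefactor * scale * (
        beta / g23 ** 2
        - 3.0 ** (1.0 / 3.0) * beta ** 2 / (g13 * g23)
        + 3.0 ** (2.0 / 3.0) * beta ** 3 / g13 ** 2
    )
    I_i = scale * (-0.6979 + 2.4138 * beta - 2.7755 * beta ** 2 + 2.2724 * beta ** 3)
    return I_c, I_i
```

For small $\beta$ the continuum integral is positive (heating) while the
injected-photon integral is negative (cooling), in line with the discussion
that follows.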

Although $I_c$ is always positive (an absorption trough always forms), $I_i$ is
negative at most temperatures ($T_K > 4$ K) [135]. Given enough time, the gas
would therefore approach an equilibrium temperature determined by competition
between heating from continuum photons and cooling from injected photons [178].
The relevant parameter is the fraction of injected photons at the Lyα
resonance; for Pop II and very massive Pop III stars, this is 6% and 12%,
respectively [130]. However, both heating and cooling turn out to be
extremely slow and this equilibrium is never reached in practice. We find that
the continuum heating rate is [140]

$$\frac{2\,\epsilon_\alpha}{3 H n_H k_B T_K} \approx \frac{0.80}{T_K^{4/3}}\,\frac{x_\alpha}{S_\alpha}\left(\frac{10}{1+z}\right), \qquad (75)$$

which is usually negligibly small compared to X-ray heating. The reason is that
the scattering diffusivity acts to cancel the effects of recoil. From Figure 4,
it is obvious that the background spectrum is weaker on the blue side of the
line than on the red. Scattering tends to return the photon toward line center,
with the extra energy deposited in or extracted from the gas. Because more
scattering occurs on the red side, this tends to transfer energy from the gas
back to the photons, canceling the recoil exchange. We refer the interested
reader to [134, 136] for a more formal explanation.
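
A short numerical sketch makes the smallness of equation (75) concrete. The
function below (name chosen for illustration) returns the fractional
temperature change per Hubble time, with $T_K$ in kelvin, reading the
temperature dependence of the equation as $T_K^{-4/3}$.

```python
def lya_continuum_heating(T_K, x_alpha, S_alpha, z):
    """Fractional temperature change per Hubble time from eq. (75)."""
    return 0.80 / T_K ** (4.0 / 3.0) * (x_alpha / S_alpha) * (10.0 / (1.0 + z))
```

For $T_K$ = 10 K, $x_\alpha = S_\alpha = 1$, and z = 9 this gives about 0.037,
i.e. a few tenths of a kelvin per Hubble time, compared with ~ $10^3$ K per
Hubble time from equation (70) at its fiducial parameters.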

Of course, so far we have assumed steady-state ($dJ_\nu/dt = 0$); obviously the
heating rate will be much faster before equilibrium is established. But the
total energy transferred to the gas during this period is simply
$\Delta u_\alpha$, the energy lost in producing the absorption feature of
Figure 4. Thus the total energy transfer is essentially the same as in our
equation (75): this is because, although the heating rate itself is large, only
a few scattering times are required to establish a steady-state [136]. This may
seem a bit puzzling, because the Hubble expansion used in deriving equation
(71) should not affect the approach to equilibrium. But note that the redshift
continually brings entirely new photons into the line. Equilibrium must be
re-established for these photons, so the two calculations are actually
identical.

As described in §2.4, higher Lyman-series photons also scatter resonantly. 
They rapidly cascade to lower levels, so the radiation field around these lines 
never assumes a distribution like Lyα. Thus the energy exchange is more
subtle. Because the medium is so optically thick at these resonances, the few 
scatterings per photon that do occur take place at frequencies well blueward 
of resonance. In that case the photon is generally re-emitted closer to line 
center; the remaining energy - much larger than that provided through recoil 
- is deposited in the gas. This can increase the heating efficiency above what 
recoil would predict, but it is still much smaller than the Lyα heating [140].

The energy exchange through this frequency drift is maximal when scattering
occurs near the line center, which requires the line to have a relatively small
optical depth.$^{14}$ As a result, scattering across the Lyβ deuterium
resonance (which has $\tau_{\rm GP} > 2$, and which also destroys photons and
so produces an asymmetric background) is the most important of all the upper
level transitions (including those of hydrogen itself) [178]. However, this
energy is deposited in the rare deuterium atoms and must then be shared with
the hydrogen atoms through collisions, which are relatively inefficient; the
resulting heating rate is usually (though not always) slower than direct Lyα
heating.



3.2.3 Shock Heating 

The final heating mechanism is purely hydrodynamic: thermal energy injection
through IGM shocks. This is an integral part of our modern view of structure
formation. The initially tiny density fluctuations seen in the CMB grow through
gravitational instability. Because these fluctuations are not perfectly
spherical, different directions grow at different rates [70]. Collapse along
the shortest axis produces sheets (or "Zel'dovich pancakes," part of the
original motivation for 21 cm experiments in the 1970s; [51,71]), collapse
along the middle axis produces filaments, and collapse along the third produces
virialized halos. At each stage in this process, some of the gravitational
infall energy is transformed into thermal energy through shocks, creating
complex networks bounding and penetrating the sheets and filaments [179,180].
This paradigm of the "cosmic web" emerges naturally in cosmological simulations
and nicely explains both the large-scale distribution of galaxies in redshift
surveys and the distribution of IGM gas in the Lyα forest.

14 In the Doppler core, photons are preferentially absorbed by atoms moving
opposite the photon; the isotropic re-emission imparts ~ $h\Delta\nu_D$ to the
gas. In the high optical depth case, absorption occurs in the Lorentz wings of
the line, where the direction of the absorbing atom is much less significant.

Simulations show that such shocks dominate the thermal energy budget of
the IGM at z ~ 0 [181,182]. Fortunately, their characteristic properties are
easy to estimate. Each cosmic web shock is part of the ongoing (but recent)
collapse of a nonlinear structure. Thus the peculiar velocity of the shocked gas
must be [181]

$$v_{\rm sh} \approx H(z)\,R_{\rm nl}(z), \qquad (76)$$

where $R_{\rm nl}^3 = 3M_{\rm nl}/(4\pi\bar{\rho})$ and $M_{\rm nl}$ is the mass
scale that is just entering the nonlinear regime: in other words, its velocity
is simply the distance it has collapsed ($R_{\rm nl}$) divided by the age of
the universe. The corresponding postshock temperature is then (assuming strong
shocks and a monatomic gas)

$$T_{\rm sh} = \frac{3\mu m_p}{16 k_B}\,v_{\rm sh}^2, \qquad (77)$$

where $\mu$ is the mean molecular weight in atomic units. This simple argument
reproduces the typical shock temperature $T_{\rm sh} \sim 10^7$ K at the
present day [181] and also quantifies their importance at higher redshifts.
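
The strong-shock jump condition in equation (77) is easy to evaluate. In the
sketch below, the value $\mu = 0.6$ (appropriate for ionized primordial gas)
and the function name are assumptions made for the example; the physical
constants are standard SI values.

```python
K_B = 1.380649e-23     # Boltzmann constant, J/K
M_P = 1.67262192e-27   # proton mass, kg


def shock_temperature(v_sh_km_s, mu=0.6):
    """Postshock temperature (K) from eq. (77) for a strong shock, monatomic gas."""
    v = v_sh_km_s * 1.0e3  # convert km/s to m/s
    return 3.0 * mu * M_P * v ** 2 / (16.0 * K_B)
```

A shock velocity of order 1000 km/s gives a temperature slightly above
$10^7$ K, and the result scales as the square of the velocity.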

Of course, a shock will only form if $T_K < T_{\rm sh}$ (which depends on the
typical mass of a collapsing object $M_{\rm nl}$). At the present day,
photoheating sets $T_K \approx 10^4$ K, and infall shocks are quite strong. At
moderate redshifts z > 3, the typical nonlinear object is much smaller while
$T_K$ (again set by photoheating) is about the same. Thus
$T_{\rm sh} \sim T_K$ and shock-heating is insignificant [183].

However, we have already seen that before reionization $T_K$ can be quite
small. Both analytic models [183] and simulations [184-186] predict that shock
heating significantly affects the thermal structure of the IGM at high
redshifts. A simple analytic description of cosmic web shocks will suffice to
build intuition [183,187,188]. The model associates these shocks with the
"turnaround" of spherical perturbations, where the flow breaks off from the
Hubble expansion and begins to converge on itself; at this point, perturbations
in the flow can easily induce shocks. Turnaround occurs at a linearized
overdensity $\delta_{\rm sh} = 1.06$, just when perturbations become nonlinear.
So the model reproduces equation (76) and hence the characteristic temperature
of z = 0 shocks. We will use it to estimate the mass fraction inside cosmic web
shocks in an analogous way to the collapse fraction [101,189]. The fraction of
gas that has been shocked to a temperature $T > T_\gamma$ is (cf. eq. 7)

$$f_{\rm sh}(>T_\gamma) = {\rm erfc}\left[\frac{\delta_{\rm sh}}{\sqrt{2}\,\sigma(T_\gamma)}\right], \qquad (78)$$

where $T_\gamma$ corresponds to a mass $m_\gamma \approx 4.3 \times 10^5\,M_\odot$
via equations (76) and (77). This yields $f_{\rm sh}$ ~ (0.1%, 3%, 25%) at
z = (30, 20, 10): like halos, the cosmic web accretes gas rapidly in this
regime. Note that turnaround occurs at a physical overdensity
$\delta_{\rm ta} = 5.55$, so this gas is overdense and has relatively efficient
collisional coupling. In the calculations shown below, we will therefore assume
for simplicity that it all emits 21 cm radiation relative to the CMB.
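
Equation (78) has the same form as the collapse fraction, so it can be
evaluated with the complementary error function once the rms linear density
fluctuation on the mass scale $m_\gamma$ is known. In this sketch the caller
supplies that quantity as `sigma` (computing it requires a matter power
spectrum, which is outside the scope of the example), and the function name is
illustrative.

```python
import math


def shocked_fraction(sigma, delta_sh=1.06):
    """Mass fraction in cosmic web shocks, eq. (78), for a given rms fluctuation."""
    return math.erfc(delta_sh / (math.sqrt(2.0) * sigma))
```

Because `sigma` grows as structure forms, the shocked fraction rises steeply
with time, which is the behavior reflected in the values quoted above.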

This model agrees reasonably well with simulations [174,185], but two caveats 
are necessary. First, we have ignored X-ray heating in equation (78), which 
will smooth out these shocks. Fortunately, when X-rays are important shocks 
do not affect the 21 cm signal anyway, because all the gas is hot. Second, in 
the absence of X-ray heating, the IGM sound speed is so low that even tiny 
peculiar velocities could potentially source shocks. In a cold IGM, such slow 
shocks could form even before turnaround and would therefore be missing 
from our model. If so, they could dramatically affect the predictions when
$T_K \ll T_\gamma$. There is some indirect evidence for such shocks [184], but
they are difficult to resolve in simulations and have not yet been studied in
detail.

3.3 The Spin Temperature 

With the thermal evolution in hand, we now turn to the spin temperature.
Recall from equation (67) that collisions are totally inefficient at z < 30
(except inside filaments, or if $\bar{x}_i$ is large enough for collisions with
electrons to dominate; see §7.2), so we must rely on the Wouthuysen-Field
effect. Of course (as for the X-ray background), we cannot yet predict the
detailed evolution of $J_\alpha$, because it depends on the star formation
history as well as any other radiation background (quasars, etc.). But we can
make an educated guess by assuming that it traces the star formation rate,
which is again proportional to the rate at which matter collapses into
galaxies. We therefore write the comoving emissivity at frequency $\nu$ as

$$\epsilon(\nu, z) = f_\star\,n_b^c\,\epsilon_b(\nu)\,\frac{df_{\rm coll}}{dt}, \qquad (79)$$

where $n_b^c$ is the comoving number density of baryons and $\epsilon_b(\nu)$
is the number of photons produced in the frequency interval $\nu \pm d\nu/2$
per baryon incorporated into stars. Here we are only interested in photons
between Lyα and the Lyman limit. Although real spectra are rather complicated,
a useful quantity is the total number $N_\alpha$ of photons per baryon in this
interval. For low-metallicity Pop II stars and very massive Pop III stars, this
is $N_\alpha = 9690$ and $N_\alpha = 4800$, respectively. More detailed spectra
(and fits) appear in [190-192].

Although only Lyα photons efficiently couple to $T_S$, higher Lyman-series
photons contribute by cascading to Lyα (see §2.4). The average background at
$\nu_\alpha$ is$^{15}$

$$J_\alpha(z) = \sum_{n=2}^{n_{\rm max}} J_\alpha^{(n)}(z) = \sum_{n=2}^{n_{\rm max}} f_{\rm rec}(n)\int_z^{z_{\rm max}(n)} dz'\,\frac{(1+z)^2}{4\pi}\,\frac{c}{H(z')}\,\epsilon(\nu'_n, z'), \qquad (80)$$

where $\nu'_n$ is the frequency at redshift $z'$ that redshifts into the Ly$n$
resonance at redshift $z$, $z_{\rm max}(n)$ is the largest redshift from which
a photon can redshift into the Ly$n$ resonance, and $f_{\rm rec}(n)$ is the
fraction of Ly$n$ photons that actually cascade through Lyα and induce strong
coupling (see §2.4). The sum must be truncated at some large $n_{\rm max}$, but
its precise value does not matter because the high-$n$ lines are so closely
spaced. Once we know $J_\alpha$, we can compute the coupling coefficient
$x_\alpha$ from equation (40).
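
The upper limit $z_{\rm max}(n)$ is not written out explicitly in the text. A
common choice, assumed in the sketch below, is that a photon reaching the Ly$n$
resonance must have been emitted below the next line, Ly$(n+1)$, since it would
otherwise have been absorbed there first. With hydrogenic line frequencies
proportional to $1 - n^{-2}$, this fixes $1 + z_{\rm max}(n)$ in terms of
$1 + z$. Function names are illustrative.

```python
def lyman_frequency_ratio(n):
    """Frequency of the Ly-n line in units of the Lyman-limit frequency."""
    return 1.0 - 1.0 / n ** 2


def one_plus_z_max(n, z):
    """1 + z_max(n): emission just below Ly-(n+1) that redshifts into Ly-n at z.

    Assumes photons are removed at the first Lyman resonance they encounter.
    """
    return (1.0 + z) * lyman_frequency_ratio(n + 1) / lyman_frequency_ratio(n)
```

For n = 2 the ratio is 32/27, while for large n it approaches unity, which
illustrates why the precise truncation $n_{\rm max}$ matters little: the
redshift intervals feeding the high-$n$ lines become very narrow.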

Of course, processes other than star formation can also create a Lyα
background. These include UV photons from quasars, recombinations in a
partially ionized medium, and collisional excitation from X-rays [98]. In the
latter, a fraction $f_{X,\rm coll}$ ~ 1/3 of the energy is typically lost to
excitations (see eq. 69); the fraction of this that eventually winds up in Lyα
photons is ~ 0.8 [129]. The coupling coefficient induced by these line photons
is [127, 128]

$$x_\alpha^{X\text{-ray}} \sim 0.05\,S_\alpha f_X\left(\frac{f_{X,\rm coll}}{1/3}\,\frac{f_\star}{0.1}\,\frac{df_{\rm coll}/dz}{0.01}\right)\left(\frac{1+z}{10}\right)^3. \qquad (81)$$

Here we have substituted the same emissivity as in equation (70); thus heating
is accompanied by a small, though far from negligible, Lyα background. This
process is particularly important near star-forming galaxies, where most soft
X-rays are absorbed [127-129]. Recombinations in a partially ionized medium
can also produce a weak background [98].



15 Note that we neglect absorption by the higher Lyman series resonances of
deuterium here, which reduce the spectrum just blueward of the hydrogen
resonances [139]. Some of the scattered photons are re-injected at the
deuterium Lyα line and then redshift into hydrogen Lyα.

3.4 The Ionization History



The next step is to compute the evolution of the ionization fraction $\bar{x}_i$. This,
of course, has been the subject of extensive study in the past three decades 
(see, e.g., [46-49] for recent reviews). As such, we will not attempt to explore 
all of its aspects but only to describe its principal components. In §8 we will 
consider the spatial fluctuations in the ionization fraction that will be the 
main focus of most 21 cm measurements. 

The usual assumption (and the one we will make here) is that ionizing photons
are produced inside of galaxies, so that their production rate can be associated
with the star formation rate in a similar way to the Lyα radiation background
and our X-ray heating model (see eqs. 70 and 79). In the most basic
approximation, we simply assign a fixed average ionizing efficiency across all
galaxies, so that

$$\bar{x}_i = \zeta f_{\rm coll}/(1 + \bar{n}_{\rm rec}), \qquad (82)$$

where $\bar{n}_{\rm rec}$ is the mean number of recombinations per ionized
hydrogen atom and the ionizing efficiency is

$$\zeta = A_{\rm He}\,f_\star\,f_{\rm esc}\,N_{\rm ion}. \qquad (83)$$

In this expression, $f_{\rm esc}$ is the fraction of ionizing photons that
escape their host galaxy into the IGM, $N_{\rm ion}$ is the mean number of
ionizing photons produced per stellar baryon, and
$A_{\rm He} = 4/(4 - 3Y_p) = 1.22$, where $Y_p$ is the mass fraction of helium,
is a correction factor to convert the number of ionizing photons per baryon in
stars to the fraction of ionized hydrogen.$^{16}$ We give some fiducial
estimates for these parameters in §3.4.2.

16 Here we have assumed that helium is singly ionized along with hydrogen,
because their ionization potentials are relatively close.

A more sophisticated treatment includes both ionizing sources and
recombinations, so

$$\frac{d\bar{x}_i}{dt} = \zeta(z)\,\frac{df_{\rm coll}}{dt} - \alpha\,C(z, \bar{x}_i)\,\bar{x}_i(z)\,\bar{n}_e(z), \qquad (84)$$

where $\alpha$ is the recombination coefficient,
$C \equiv \langle n_e^2\rangle/\langle n_e\rangle^2$ is the clumping factor, and
$\bar{n}_e$ is the average electron density in ionized regions. The precise
definition of $C$ is subtle but important: the electron density is averaged (by
volume) over all regions penetrated by ionizing photons (thus excluding gas
outside of ionized bubbles and gas inside self-shielded clumps). Equation (84)
allows both the ionizing efficiency $\zeta$ and the clumping factor $C$ to
depend on redshift, and hence implicitly on $\bar{x}_i$ or any other parameter
of galaxy formation.
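
The two treatments can be sketched in a few lines. The first function evaluates
the ionizing efficiency of equation (83); the second integrates an equation of
the form (84) with a forward-Euler step, taking the source term
$\zeta\,df_{\rm coll}/dt$ and the sink coefficient $\alpha C \bar{n}_e$ as
caller-supplied functions of time. The helium mass fraction $Y_p = 0.24$, the
function names, and the choice of integrator are assumptions made for
illustration; a production calculation would use a proper ODE solver and a
physical collapse-fraction history.

```python
def ionizing_efficiency(f_star, f_esc, N_ion, Y_p=0.24):
    """zeta = A_He * f_star * f_esc * N_ion, eq. (83); Y_p = 0.24 is assumed."""
    A_He = 4.0 / (4.0 - 3.0 * Y_p)
    return A_He * f_star * f_esc * N_ion


def integrate_ionized_fraction(times, source, sink, x0=0.0):
    """Forward-Euler solution of dx/dt = source(t) - sink(t) * x, cf. eq. (84).

    source(t) stands for zeta * df_coll/dt and sink(t) for alpha * C * n_e;
    both are hypothetical callables supplied by the user. The result is
    clipped to the physical range [0, 1].
    """
    x = [x0]
    for t0, t1 in zip(times[:-1], times[1:]):
        dxdt = source(t0) - sink(t0) * x[-1]
        x.append(min(1.0, max(0.0, x[-1] + (t1 - t0) * dxdt)))
    return x
```

With purely illustrative inputs $f_\star = 0.1$, $f_{\rm esc} = 0.1$, and
$N_{\rm ion} = 4000$, the efficiency is $\zeta \approx 49$. For constant source
and sink terms the integrator relaxes to their ratio, the balance between
ionizations and recombinations.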



3.4.1 Recombinations and the Clumping Factor

Before considering $\zeta$, we first discuss some subtleties of the sink term,
which is more complicated than it looks. First of all, the recombination
coefficient is uncertain by a factor of a few through both the gas temperature
and an environmental factor that determines whether case-A or case-B is more
appropriate.$^{17}$ On the one hand, consider the case in which ionizations
(and hence recombinations) are distributed uniformly throughout the IGM (the
usual assumption in the literature, e.g., [193,194]). Then case-B would be
appropriate, with
$\alpha_B \approx 2.6 \times 10^{-13}(T_K/10^4\,{\rm K})^{-0.7}$ cm$^3$ s$^{-1}$
[195]. On the other hand, in the highly-ionized low-redshift universe, most
recombinations actually take place in dense, partially neutral gas (so-called
Lyman-limit systems) because high-energy photons can penetrate inside these
high-column density systems. However, the ionizing photons produced after
recombinations to the ground state usually lie near the Lyman limit (where the
mean free path is small) so are consumed inside the systems. Thus these photons
would not help ionize the IGM, and case-A (with
$\alpha_A \sim 4.2 \times 10^{-13}(T_K/10^4\,{\rm K})^{-0.7}$ cm$^3$ s$^{-1}$
[195]) would be more appropriate [196]. Which of these regimes is more relevant
depends on the details of small-scale clumping and radiative transfer and is
not yet a solved problem. Additional uncertainty comes from the gas temperature,
which depends on the ionizing spectrum, another large unknown [197].
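
The two recombination coefficients quoted above, together with a clumping
factor, set the recombination time of ionized gas. The sketch below encodes the
power-law fits and a simple recombination time $1/(\alpha C n_e)$; the function
names and the CGS unit conventions (densities in cm$^{-3}$, times in seconds)
are choices made for this example.

```python
def alpha_B(T):
    """Case-B recombination coefficient in cm^3/s for gas temperature T in K."""
    return 2.6e-13 * (T / 1.0e4) ** -0.7


def alpha_A(T):
    """Case-A recombination coefficient in cm^3/s for gas temperature T in K."""
    return 4.2e-13 * (T / 1.0e4) ** -0.7


def recombination_time(n_e, T, clumping=1.0, case="B"):
    """Recombination time in seconds, 1 / (alpha * C * n_e), with n_e in cm^-3."""
    alpha = alpha_B(T) if case == "B" else alpha_A(T)
    return 1.0 / (alpha * clumping * n_e)
```

At fixed temperature, switching from case-B to case-A shortens the
recombination time by the ratio 4.2/2.6, about a factor of 1.6, before any
uncertainty in the temperature or clumping is included.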

Even more problematic is the clumping factor $C(z)$. In principle, of course,
this can be computed through numerical simulations. But that requires
overcoming two difficult problems: (1) tracing the gas distribution with
sufficient precision to resolve density fluctuations on the smallest scales and
(2) correctly tracing the topology of ionized and neutral gas, because the
average must be performed only over the ionized gas. The first problem is
obvious: even leaving aside the interstellar medium (ISM) of each galaxy (which
is accounted for by $f_{\rm esc}$ in eq. 83) the Jeans mass in the cold IGM is
$\lesssim 10^5\,M_\odot$. This allows the formation of a well-defined cosmic
web, as well as "minihalos," dense gas clouds that virialize but cannot cool or
form stars. But simulations of reionization must span ~ 100 Mpc boxes in order
to adequately sample the large HII regions, requiring an enormous dynamic
range. Thus, even in simulations, clumping is usually accounted for through a
"subgrid" model built from semi-analytic techniques or smaller simulations
[108,198-200]. Minihalos are particularly problematic. Fortunately, although
early estimates predicted that they could increase the clumping factor by more
than an order of magnitude [201], more recent efforts show that they have a
relatively modest effect, consuming only a few photons per baryon [202,203].

17 Here "case-A" allows recombinations directly to the ground state while
"case-B" does not; in the latter, ionizing photons produced from recombinations
into the ground state are assumed to be immediately absorbed again, so that
they do not affect the net ionization rate.

The second problem is perhaps more subtle: how do the sources and absorbers 
relate to each other, and how does ionization affect the small-scale clumping? 
For example, if low-density gas is ionized first, C < 1 throughout most of reion- 
ization [10], because all the dense gas would remain locked up in self-shielded 
systems. This is an attractive picture because (on small scales) it is certainly 
easier to ionize voids than filaments or dense blobs. On the other hand, on 
large scales the ionizing sources actually lie inside overdense regions (sheets 
and filaments), where the recombination rate is relatively high. Moreover, as 
the gas is ionized, the thermal pressure will increase and the clumpiness will
decrease. While widely appreciated in the context of minihalos [199,201-203], 
the implications for the clumping of the more diffuse IGM have not been 
studied. 

All of these issues have been considered in the literature, but unfortunately 
there is no single unifying model that accounts for realistic small-scale struc- 
ture along with the topology of reionization and feedback. The two most 
common approaches include different physics. The first is to compute the 
clumping factor from high-resolution simulations, ignoring reionization and 
feedback. One must still be careful to exclude gas bound to ionizing sources
(which are described by $f_{\rm esc}$ rather than $C$; see [200]), or else the
estimated clumping is much too large (this problem is why the widely-quoted
C ~ 30 value of [204] is much larger than more recent estimates). One recent
estimate, from a $3.5\,h^{-1}$ Mpc N-body simulation resolving the Jeans mass in
the IGM, is well-fit by [205]

$$C(z) = 27.466\,\exp(-0.114z + 0.001328z^2). \qquad (85)$$
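
The fit in equation (85) can be evaluated directly. The sketch below (function
name chosen for illustration) assumes the coefficients 27.466, 0.114, and
0.001328 and makes no claim about the redshift range over which the fit is
valid.

```python
import math


def clumping_factor(z):
    """Clumping factor from the fitting formula of eq. (85)."""
    return 27.466 * math.exp(-0.114 * z + 0.001328 * z ** 2)
```

At z = 10 this gives C of about 10, and the value falls toward higher redshift
over the range relevant for reionization, reflecting the smoother gas
distribution at early times.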

The other approach follows [10] and assumes that the lowest-density gas is ion- 
ized first; in this case recombinations are much less significant. (This model 
can be modified to include at least some aspects of source clustering, which 
increases the clumpiness by a factor of a few [22].) The truth lies somewhere
between these extremes for two reasons: (1) the numerical estimate neglects 
photoheating during reionization, while the other model of [10] is based exclu- 
sively on post-reionization simulations with a warm IGM, and (2) the numeri- 
cal estimate averages over all gas and does not allow any to sit in self-shielded 
neutral systems, while assuming voids to be ionized first represents a partic- 
ularly optimistic case for identifying such systems. 

3.4.2 The Ionizing Efficiency



We now move on to the source term in equation (84). This has two parts:
$df_{\rm coll}/dt$ and the ionizing efficiency $\zeta$. The collapse fraction for a given
cosmology depends only on $m_{\rm min}$, the mass threshold for galaxy formation. The
most common choice for $m_{\rm min}$ corresponds to a virial temperature $T_{\rm vir} = 10^4$ K,
the threshold at which hydrogen line cooling becomes efficient for primordial
gas (see, e.g., [46]). Above this mass, cooling and fragmentation into stars is
relatively straightforward. Other choices are, however, physically plausible in
certain regimes. For example, H$_2$ cooling could allow star formation in much
smaller halos, although it is relatively weak (limiting the expected star
formation efficiency) and it is subject to a variety of feedback mechanisms (see,
e.g., [48,206,207] and references therein).

We can use local measurements of $f_\star$, $f_{\rm esc}$, and $N_{\rm ion}$ to guide our choices,
though the extrapolation to high redshifts is always difficult. Efficiencies
$f_\star \sim 10\%$ are reasonable for the local Universe, but so little gas has collapsed by
$z = 6$ that this does not directly constrain the high-redshift value. Appropriate
values for Pop III stars are even more uncertain. To the extent that each
halo can form only a single very massive ($> 100\,M_\odot$) star that enriches the
entire halo, $f_\star \sim (\Omega_m/\Omega_b) M_\star/M_h < 10^{-3}$, though larger values are often taken
in the literature (implicitly assuming either inefficient metal dispersal or an
extremely rapid starburst).
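The single-star estimate above is easy to check numerically. The sketch below uses purely illustrative inputs that are assumptions rather than values from the text: a stellar mass of 100 $M_\odot$, a host halo of $10^6\,M_\odot$, and density parameters $\Omega_m = 0.26$ and $\Omega_b = 0.044$.

```python
def single_star_efficiency(m_star: float, m_halo: float,
                           omega_m: float, omega_b: float) -> float:
    """f_star ~ (Omega_m / Omega_b) * M_star / M_halo.

    This is the fraction of the halo's baryons locked into one star,
    assuming the halo holds the cosmic mean baryon fraction.
    """
    return (omega_m / omega_b) * m_star / m_halo


# All four inputs are hypothetical, chosen only for illustration.
f_star = single_star_efficiency(m_star=100.0, m_halo=1.0e6,
                                omega_m=0.26, omega_b=0.044)
print(f"f_star ~ {f_star:.1e}")
```

With these assumed inputs the efficiency falls below $10^{-3}$, in line with the bound quoted above; a more massive host halo lowers it further.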

The UV escape fraction is small in nearby star-forming galaxies (including
the Milky Way), with many upper limits $f_{\rm esc} < 6\%$ [208-212] and only a few
positive detections (at comparable levels). An initial detection of ionizing flux
from a stacked sample of blue (and hence atypical) Lyman-break galaxies at
$z \sim 3$ implied $f_{\rm esc} \approx 10\%$ [213], but more recent observations either place
upper limits $f_{\rm esc} < 5$-$10\%$ [214-217] or claim detections at much lower levels,
$f_{\rm esc} \sim 2\%$ [218]. Note as well that $f_{\rm esc}$ shows large variance between galaxies.
Theoretically the problem is equally difficult, because it depends on the spatial
distribution of hot stars and absorbing gas in the ISM. Some studies suggest
that the escape fraction in high-redshift galaxies could be much higher than
the detected values (generally because higher specific star formation rates
allow supernovae to blow transparent windows through the ISM) [219-221], but
others predict that it will remain small [222]. Two special cases are worth
noting. First, quasars may have large escape fractions because they concentrate
all of their photons in one spatial location [222]. Second, the shock bounding
the HII regions of very massive Pop III stars may evacuate all the gas from
their host halos if $T_{\rm vir} < 10^4$ K. In that case, $f_{\rm esc} \sim 1$ [223].

$N_{\rm ion}$ depends only on the stellar initial mass function and metallicity.
Convenient approximations are $N_{\rm ion} \approx 4000$ for $Z = 0.05\,Z_\odot$ Pop II stars with a
Scalo IMF [46,224,225] and $N_{\rm ion} \approx 40{,}000$ for very massive Pop III stars [191].
Note that the latter assumes that all Pop III stars are massive; metal-free
stars with a normal Salpeter IMF are only ~ 1.6 times more efficient than
their Pop II counterparts [226]. More detailed spectra for Pop II stars can be
found in [190,227] and for Pop III stars in [191,226,228].



3.4.3 Feedback During Reionization

The wide range of allowed values for $m_{\rm min}$, $f_{\rm esc}$, and $N_{\rm ion}$ suggests that
these quantities might evolve in a non-trivial way, either over time or across 
the galaxy population. Such feedback effects are now considered crucial to 
reionization, so here we will briefly outline their effects on its global history. 
In particular, we will focus on how feedback prolongs reionization. We refer 
the interested reader to [48] for a more comprehensive discussion. 

Feedback mechanisms can be conveniently divided into "internal" and "exter- 
nal" flavors. The former includes processes that affect later generations of ob- 
jects within the same galaxy. One straightforward example is self-enrichment: 
once Pop III stars pollute their host halo with metals, subsequent star forma- 
tion will be Pop II. Supernova winds also have other internal feedback effects: 
most importantly, they can eject a large fraction of the interstellar gas into 
the IGM, shutting off star formation (if only temporarily). These winds are 
well-known in the local universe [229] and at moderate redshifts [230], and 
they may be even more important at high redshifts. This is easy to see if we 
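compare the energy output in supernovae per baryon, $W_{\rm SN} = f_\star \nu_{\rm SN} E_{\rm SN}$, to
the binding energy per baryon of a virialized halo, $E_b \sim G m_h/(\lambda_s r_{\rm vir})$:

$$\frac{W_{\rm SN}}{E_b} \sim E_{51} \left(\frac{f_\star}{0.1}\right) \left(\frac{\nu_{\rm SN}}{0.01\,M_\odot^{-1}}\right) \left(\frac{\lambda_s}{0.05}\right) \left(\frac{10}{1+z}\right) \left(\frac{m_h}{10^{[\ldots]}\,M_\odot}\right)^{-2/3}. \qquad (86)$$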



Here $r_{\rm vir}$ is the virial radius, $E_{\rm SN} = 10^{51} E_{51}$ erg is the energy per supernova,
$\nu_{\rm SN}$ is the specific frequency of supernovae (i.e., the number of supernovae per
unit mass of star formation), and we have assumed that the baryons lie at a
typical distance $\lambda_s r_{\rm vir}$ from the halo center (our fiducial value is appropriate
for disks in normal galaxies [231]). Thus the supernova energy reservoir is
probably comparable to the binding energy of the baryons in high-redshift
dwarf galaxies, suggesting that they could have a significant dynamical effect.
Because $W_{\rm SN}/E_b \propto m_h^{-2/3}$, the feedback efficiency is a function of halo mass,
and $f_\star$, $f_{\rm esc}$, or any of our other parameters may also be functions of halo
mass. Indeed, low-mass galaxies in the SDSS appear to have $f_\star \propto m_h^{2/3}$ [232]
that may be a result of supernova feedback [233].
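The halo-mass dependence of the supernova-to-binding energy ratio can be traced in a few lines. This is a sketch that tracks proportionalities only, and it assumes that halos virialize at a fixed overdensity $\Delta_c$ relative to the mean density $\bar{\rho}(z) \propto (1+z)^3$; that assumption is introduced here and is not stated in the text.

```latex
\begin{align}
m_h &= \frac{4\pi}{3}\,\Delta_c\,\bar{\rho}(z)\,r_{\rm vir}^{3}
  \;\Longrightarrow\; r_{\rm vir} \propto m_h^{1/3}\,(1+z)^{-1}, \\
E_b &\sim \frac{G\,m_h}{\lambda_s\,r_{\rm vir}}
  \propto \lambda_s^{-1}\,m_h^{2/3}\,(1+z), \\
\frac{W_{\rm SN}}{E_b} &\propto f_\star\,\nu_{\rm SN}\,E_{\rm SN}\,\lambda_s\,
  m_h^{-2/3}\,(1+z)^{-1}.
\end{align}
```

At fixed redshift, a halo ten times less massive therefore has a ratio larger by $10^{2/3} \approx 4.6$, which is why supernova feedback is expected to matter most in the smallest galaxies.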

Fortunately, internal feedback mechanisms are relatively easy to incorporate 
into reionization models. For the most part, we can mimic them by setting 
$\zeta = \zeta(m_h)$. Because these processes are local, on a global level they probably
do not evolve rapidly with cosmic time and so will not introduce features
into $\bar{x}_i(z)$. For instance, self-enrichment causes each individual galaxy's star
formation mode to evolve on a short timescale, but the continuous formation 
of galaxies throughout the Universe prevents any sharp evolution in the global 
initial mass function (although of course it can prolong reionization relative 
to a constant efficiency scenario [193]). 

External feedback is much more difficult to incorporate into analytic models. 
This includes all those effects that galaxies can have on their neighbors. We 
consider two representative mechanisms here. 

Metal enrichment: As described above, supernova winds likely escape their 
host galaxies, spreading metals into the IGM and affecting subsequent gen- 
erations of galaxies. The effective transition between (very massive) Pop III 
and Pop II star formation is sudden, occurring at a critical metallicity
$Z_t \sim 10^{-3.5}\,Z_\odot$ in the gas phase [234,235] or at $Z_t \sim 10^{-5}\,Z_\odot$ if dust is included [236].
Because the transition could be accompanied by a large drop in $N_{\rm ion}$, this may
have had an enormous effect on the reionization history. 

The simplest prescription for chemical feedback is to track the mean cosmic 
metallicity $\bar{Z}$ and to switch the mode of star formation once $\bar{Z} > Z_t$. Because
this imposes a global drop in the ionizing emissivity, it led to several
predictions of "double reionization" in which $\bar{x}_i(z)$ actually decreased over
some finite time interval [193,194,237-239]. However, this approximation is
not a good one, because metal enrichment (like reionization) is highly
inhomogeneous and must be modeled physically (rather than with an arbitrary
prescription) [240,241]. On the one hand, internal feedback must be included:
even before $\bar{Z} = Z_t$, ongoing accretion onto existing galaxies produces Pop II
stars (assuming mixing is efficient). 

The other aspect is that newly-formed halos only produce Pop III stars if 
they collapse from pristine material. As galactic winds expand into the IGM, 
more and more halos form out of pre-enriched gas, eventually choking off 
the supply of Pop III stars. Wind enrichment has been studied extensively 
from a theoretical perspective [242-246], though firm predictions are even 
more difficult than for reionization because the physics is so complex. The 
crucial point, however, is obvious: winds expand relatively slowly compared to 
ionizing photons, broadening the transition from Pop III to Pop II [241,247]. 
For the same reason, it is difficult to arrange for complete enrichment to 
precede reionization. 

The condition for double reionization follows from equation (84) [241] 




(87) 
where $\zeta'$ is the ionizing efficiency of the second generation of sources (in this
case, Pop II stars). Near the end of reionization, $\zeta\,df_{\rm coll}/dz \sim 1$, so only if
$C \gg 1$ or $\zeta/\zeta' \gg 1$ can recombinations dominate. It is important to note
that, although the mean clumpiness may be large, rapidly recombining regions
contain only a small fraction of the total mass. Moreover, after reionization,
equation (85) no longer applies because it neglects Jeans smoothing
from photoheating, so $C$ is likely reasonably small in the vast majority of
the IGM. Of course, double reionization is still possible if $\zeta/\zeta'$ is sufficiently
large. However, in practice the slow evolution of the Pop III fraction (thanks
to self-enrichment) smooths the transition so that, if the ratio is large,
reionization takes place over a long enough period that double reionization requires
$\zeta_{\rm III}/\zeta_{\rm II} \sim 200$ [241]; while in principle possible, this certainly pushes
parameters in uncomfortable directions.

Photoheating: A second crucial feedback mechanism is photoionization itself, 
which heats the gas because the liberated electrons are typically left with ener- 
gies > 1 eV. The increased thermal pressure suppresses accretion onto small 
halos and hence decreases the rate of star formation. This is usually quantified 
by raising the Jeans mass in heated regions to a larger mass. Unfortunately, 
the equivalent virial temperature $T_h$ is not entirely clear; the canonical value
of $T_h \sim 2 \times 10^5$ K [248-251] may be an overestimate for halos collapsing during
or soon after reionization [252]. 
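To get a feel for how strongly photoheating raises the mass threshold, the sketch below converts a ratio of virial temperatures into a ratio of minimum halo masses. It assumes the standard virial scaling $T_{\rm vir} \propto m_h^{2/3}$ at fixed redshift (an assumption introduced here), so only the ratio is meaningful, not the absolute masses.

```python
def mass_threshold_ratio(t_heated_k: float, t_cool_k: float = 1.0e4) -> float:
    """Ratio of minimum halo masses for two virial temperatures.

    Assumes T_vir is proportional to m_h**(2/3) at fixed redshift,
    so that m_min scales as T_vir**1.5.
    """
    return (t_heated_k / t_cool_k) ** 1.5


# Canonical photoheated threshold versus the atomic-cooling threshold.
ratio = mass_threshold_ratio(2.0e5)
print(f"m_min increases by a factor of about {ratio:.0f}")
```

Under this scaling, the canonical $T_h$ raises the minimum mass by nearly two orders of magnitude relative to the $10^4$ K threshold, which is why the precise value of $T_h$ matters so much for the duration of reionization.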

Photoheating feedback provides a "self-regulation" mechanism for reionization,
because $\bar{x}_i$ itself (or at least the closely related "photoheated fraction"
[241]) controls the transition. Thus photoheating can significantly extend the
reionization epoch (by $\Delta z \sim 4$ for the canonical $T_h$) by causing a plateau when
$\bar{x}_i \sim 0.5$. On the other hand, self-regulation makes it quite difficult for
recombinations to dominate at any point in this kind of feedback mechanism [241].
Note as well that photoheating can only be important if small halos dominate
the photon budget. If $\zeta$ increases rapidly with mass, reionization is only
slightly delayed.

Thus the major effect of external feedback (in either manifestation) is to pro- 
long reionization, but it probably does not introduce sharp features into the 
history. Other mechanisms, such as the suppression of molecular hydrogen 
cooling by soft-UV photons, likely have much smaller effects on $\bar{x}_i(z)$ [253-257],
though they may affect star formation in the first galaxies. Internal feedback 
affects the distribution of ionizing efficiencies within galaxies but also does 
not introduce strong features into the global ionization history. The 21 cm 
background and its fluctuations will likely be one of our premier tools for 
understanding the role these various processes play. 

3.4.4 Exotic Reionization Scenarios

To this point, we have implicitly assumed that stars (of one form or another) 
reionized the universe. While that is clearly the most natural explanation, 
other sources - such as quasars - may also have contributed. Because quasars 
form in collapsed halos, they would probably produce qualitatively similar 
$\bar{x}_i(z)$ to star formation, although many of the ionizations would be due to
X-rays (which would affect the topology of reionization; see §7.2). However,
extrapolating the known population of quasars to higher redshifts suggests
that they are unimportant [6,193], and the $z = 0$ X-ray background prevents
them from dominating the reionization era [258,259] (see §7.2). On the other
hand, there are currently no direct observational constraints on faint, high- 
redshift AGN, so they may still be important. 

An even more exotic possibility is reionization from some sort of decaying 
or annihilating particle [171, 260-264]. In that case, the ionization history 
would be entirely uncorrelated with structure formation, and there is no a 
priori reason to expect sharp features. However, the CMB does constrain such 
models, because long-lifetime decay leads to a protracted ionization history 
that affects the CMB temperature and polarization anisotropies [171,263]. 
Several currently fashionable dark matter models do not produce significant 
ionization [264], though they can affect the 21 cm history [162,265]. 

3.5 The Global History 

We are now in a position to compute the evolution of $\delta T_b$ in some representative
structure formation models. These histories illustrate the basic features
of the observable signal and set the stage for the next several sections. We 
will necessarily ignore many of the subtleties in constructing detailed reion- 
ization histories; we refer the interested reader to the existing literature for 
more examples [168,193,194,237-241,266-269]. 

3.5.1 Some Critical Points in the 21 cm Line History

There are five "critical points" in the 21 cm history that divide the signal
into several distinct epochs. The first is $z_{\rm dec}$, when Compton heating becomes
inefficient and $T_K < T_\gamma$ for the first time (eq. 66). This marks the earliest
epoch for which 21 cm observations are possible even in principle. The second
transition is when the density falls below $\delta_{\rm coll}$ (see eq. 67), at which point
$T_S \to T_\gamma$ and the IGM signal vanishes. These two transitions are well-specified
by atomic physics processes, at least in the absence of any exotic dark sector
processes. They define the beginning and end of the "dark ages" for the purposes
of the 21 cm line.
The remaining transition points are determined by luminous sources, so their
timing is much more uncertain. These are the redshift $z_h$ at which the IGM
is heated above $T_\gamma$, the redshift $z_c$ at which $x_\alpha = 1$ so that the
Wouthuysen-Field mechanism couples $T_S$ and $T_K$, and the redshift of reionization $z_r$. Their
relative timing sets the observability of the 21 cm signal, so it is useful to
consider them in some simple parameterized models [270]. We first ask whether
Ly$\alpha$ coupling precedes the other two transitions. The net X-ray heat input $\Delta T_c$
at $z_c$ is

(88)

where $\Delta f_{\rm coll} \sim f_{\rm coll}$ is the effective collapse fraction appearing in the integrals
of equation (80). Note that $\Delta T_c$ is independent of $f_\star$ because both the coupling
and heating rates are proportional to the star formation rate. Interestingly,
for our fiducial (Pop II) parameters $z_c$ precedes $z_h$ (see also the simple models
of [133,135,168]). This could create a significant absorption epoch whose
properties offer a meaningful probe of the first sources. For example, very massive
Pop III stars have a smaller $N_\alpha$, and an early miniquasar population could
completely eliminate the absorption epoch.

A similar estimate of the ionization fraction $\bar{x}_{i,c}$ at $z_c$ yields

(89)

For Pop II stars, $N_{\rm ion}/N_\alpha \approx 0.4$; thus even in the worst case of $f_{\rm esc} = 1$ and
$n_{\rm rec} = 0$, coupling would become efficient during the initial stages of
reionization. However, very massive Pop III stars have much harder spectra, with
$N_{\rm ion}/N_\alpha \approx 7$. In principle, it is therefore possible for Pop III stars to reionize
the universe before $z_c$, although that would require extremely unusual
parameters (cf. [271]). Such histories cannot be ruled out at present, but we regard
them as exceedingly unlikely. Histories with $\bar{x}_{i,c} \ll 1$ are much more plausible,
at least given our theoretical prejudices about high-redshift sources.

Finally, we ask whether the IGM will appear in absorption or emission during
reionization. Combining equations (70) and (82), we have

(90)

for the heat input $\Delta T$ as a function of $\bar{x}_i$. Thus, provided $f_X \gtrsim 1$, the IGM
will be much warmer than the CMB during the bulk of reionization. This
is convenient in that $\delta T_b$ becomes independent of $T_S$ when $T_S \gg T_\gamma$, so it is
easier to isolate the effects of the ionization field. Significant absorption during
reionization becomes more plausible for very massive Pop III stars, because
they have much larger ionizing efficiencies (although their remnants may also
induce correspondingly large X-ray heating).

Fig. 7. Global IGM histories for Pop II stars. The solid curves take our fiducial
parameters without feedback. The dot-dashed curve takes $f_X = 0.2$. The short- and
long-dashed curves include strong photoheating feedback. (a): Thermal properties.
(b): Ionized fraction. (c): Differential brightness temperature against the CMB. In
this panel, the two dotted lines show $\delta T_b$ without including shock heating. From
[270].



3.5.2 Some Example Histories 

We will now use some representative models chosen from [270] to illustrate 
these qualitative features in a more concrete fashion (see also [133,135,168]). 
We begin with a fiducial set of Pop II parameters. We ignore feedback (of 
all kinds) and take $m_{\rm min}$ to correspond to $T_{\rm vir} = 10^4$ K, $f_\star = 0.1$, $f_{\rm esc} = 0.1$,
$f_X = 1$, $N_{\rm ion} = 4000$, and $N_\alpha = 9690$. (Thus $\zeta = 40$ for this model.) Figure 7a
shows the resulting temperature history. The dotted curve is $T_\gamma$, the thin solid
curve is $T_K$, and the thick solid curve is $T_S$. As expected from equation (88),
in this case we do indeed find that $z_c > z_h$: specifically, $z_c \approx 18$ and $z_h \sim 14$.
Clearly Ly$\alpha$ coupling is extremely efficient for normal stars.

The solid curve in Figure 7b shows the corresponding ionization history, with 
the clumping factor computed following [10] (which assumes that the lowest 
density regions are ionized first). It increases smoothly and rapidly over a 
redshift interval of $\Delta z \sim 5$, ending at $z_r \sim 7$. That is of course purely a
function of our choice for $\zeta$, but other values do not strongly affect the width.

Figure 7c shows the corresponding 21 cm brightness temperature decrement
$\delta T_b$ relative to the CMB. Here we have also labeled the corresponding
(observed) frequency $\nu$ for convenience. The signal clearly has interesting
structure. At the highest frequencies, reionization causes a steady decline in the
signal, with $|d\delta T_b/d\nu| \sim 1$ mK MHz$^{-1}$. In this model, recombinations are
relatively inefficient; the only way to significantly increase the gradient during
reionization would be with some positive feedback mechanism.

Fig. 8. Global IGM histories for very massive Pop III stars. Panels are the same as
in Fig. 7. The solid curve takes our fiducial Pop III parameters. The long-dashed
lines take $f_{\rm esc} = 1$, the short-dashed lines take $f_X = 5$, and the dot-dashed line
(shown only in c) assumes $f_{\rm esc} = 1$ and $f_X = 5$. From [270].

However, as illustrated by the dashed curves, it is relatively easy to slow 
reionization. These curves use two models for photoheating feedback [241] 
in which the minimum virial temperature for galaxy formation increases to 
$T_h = 2 \times 10^5$ K in photoheated regions, near the upper limit of theoretical
expectations [252]. It slows the evolution when $\bar{x}_i > 0.5$ (though it remains
monotonic), decreasing $|d\delta T_b/d\nu|$ by about a factor of two. The ionization
history can be slowed by an even larger factor by also decreasing $\zeta$ in heated
regions, and in extreme circumstances this could even cause $\bar{x}_i$ to decrease
slightly (see eq. 87). Unfortunately, such a "recombination epoch" would have 
spectral gradients no larger than reionization itself (though with opposite sign, 
of course). 

Figure 7c contains an even more striking feature at higher redshifts. At z ~ 30, 
the IGM is nearly invisible even though $T_K \ll T_\gamma$ (see Fig. 6). However, as
the first galaxies form, the Wouthuysen-Field effect drives $T_S \to T_K$. Because
$z_c > z_h$, this produces a relatively strong absorption signal ($\delta T_b \approx -80$ mK)
over the range $z \sim 21$-14 (or $\nu \sim 70$-95 MHz). However, the IGM still heats
up well before reionization begins in earnest, making $\delta T_b$ nearly independent
of $T_S$ throughout reionization.
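Since the histories are discussed interchangeably in redshift and observed frequency, a small conversion helper may be useful. This sketch only encodes the cosmological redshifting of the 21 cm rest frequency; the function names are illustrative.

```python
NU_21_MHZ = 1420.405751  # rest-frame frequency of the 21 cm line in MHz


def observed_frequency_mhz(z: float) -> float:
    """Observed frequency of 21 cm radiation emitted at redshift z."""
    return NU_21_MHZ / (1.0 + z)


def redshift_from_frequency(nu_mhz: float) -> float:
    """Emission redshift corresponding to an observed frequency in MHz."""
    return NU_21_MHZ / nu_mhz - 1.0


for z in (7, 14, 30):
    print(f"z = {z:2d}  ->  nu = {observed_frequency_mhz(z):6.1f} MHz")
```

For example, the heating transition at $z \sim 14$ maps to roughly 95 MHz, and the end of reionization at $z \sim 7$ in the fiducial model maps to roughly 180 MHz.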
Figure 8 shows similar histories for very massive Pop III stars. The solid
curves take $m_{\rm min}$ to correspond to $T_{\rm vir} = 10^4$ K, $f_\star = 0.01$, $f_{\rm esc} = 0.1$, $f_X = 1$,
$N_{\rm ion} = 30{,}000$, and $N_\alpha = 4800$, yielding $\zeta = 30$. Although the thermal history
is qualitatively similar to the Pop II case, it has $z_c \sim 13$ and $z_h \sim 11$. Thus the
absorption epoch is somewhat narrower, and it is also weaker because Pop III
stars produce relatively few Ly$\alpha$ photons. As a result, $T_S$ does not approach
$T_K$ until the IGM is already hot. Thus, if very massive Pop III stars dominate,
the absorption epoch will be considerably weaker, with gradients about half as
large as the Pop II case. Moreover, $z_h$ is relatively close to $z_r$, so $T_S$ does not
saturate until after reionization begins. It may therefore be somewhat difficult
to separate $T_S$ and $\bar{x}_i$ at the beginning of reionization.
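The ionizing efficiencies quoted for the two fiducial models can be reproduced with a one-line product. The sketch below assumes that the efficiency is the bare product of the star formation efficiency, the escape fraction, and the number of ionizing photons per stellar baryon, with no additional correction factors; that is what the quoted values imply, but it is an assumption here.

```python
def ionizing_efficiency(f_star: float, f_esc: float, n_ion: float) -> float:
    """Ionizing efficiency taken as the bare product f_star * f_esc * N_ion."""
    return f_star * f_esc * n_ion


# Fiducial parameter sets quoted in the text for Figures 7 and 8.
zeta_pop2 = ionizing_efficiency(f_star=0.1, f_esc=0.1, n_ion=4000.0)
zeta_pop3 = ionizing_efficiency(f_star=0.01, f_esc=0.1, n_ion=30000.0)
print(f"Pop II:  zeta = {zeta_pop2:.0f}")
print(f"Pop III: zeta = {zeta_pop3:.0f}")
```

The two models end up with similar efficiencies because the much larger photon yield of very massive Pop III stars is offset by the tenfold lower star formation efficiency assumed for them.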

Obviously, measuring this background could offer strong constraints on high- 
redshift star formation. The other curves in Figures 7-8 illustrate the range of 
features we expect. However, they all share one crucial property: z c occurs long 
before reionization, so we can safely expect some signal from the high-redshift 
IGM. 



3.6 Observational Prospects 

One obvious application of the 21 cm signal is to measure this "all-sky"
spectrum $\delta T_b(z)$ [98,184,272,273]. The wide range of histories shown in Figures 7-8
illustrates how powerful such observations would be (see also [168]). The models
we have considered imply gradients $|d\delta T_b/d\nu| \sim 1$ mK MHz$^{-1}$ during
reionization and possibly somewhat larger values during a preceding absorption
epoch.

Because this is an all-sky signal, single-dish measurements (even with a modest- 
sized telescope) can easily reach the required mK sensitivity. However, the 
much stronger synchrotron foregrounds (see §9 for a detailed discussion)
nevertheless make such observations extremely difficult: they have $T_{\rm sky} \gtrsim 200$-2000 K
over the relevant frequencies. The fundamental strategy for extracting
the cosmological signal relies on the expected spectral smoothness of the
foregrounds (which primarily have power law synchrotron spectra), in contrast to
the non-trivial structure of the 21 cm background. Nevertheless, these
foregrounds have $|dT_{\rm sky}/d\nu| \gtrsim 3$ K MHz$^{-1}$, so extracting the high-redshift
component will be a challenge that requires extremely accurate calibration over
a wide frequency range [184,272] and, most likely, sharp localized features in
$\delta T_b(z)$ that can be distinguished from smoother foreground features.
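The thermal-noise side of this argument can be illustrated with the ideal radiometer equation, $\Delta T = T_{\rm sys}/\sqrt{\Delta\nu\,t}$. The sketch below assumes that the system temperature is dominated by the sky and uses a hypothetical 1 MHz channel and a 1 mK target; both are illustrative choices rather than values from the text, and systematics are ignored entirely.

```python
def integration_time_s(t_sys_k: float, delta_t_k: float,
                       bandwidth_hz: float) -> float:
    """Ideal radiometer equation solved for the integration time.

    delta_T = T_sys / sqrt(bandwidth * t)  =>  t = (T_sys / delta_T)**2 / bandwidth
    """
    return (t_sys_k / delta_t_k) ** 2 / bandwidth_hz


# Sky temperatures spanning the quoted range; channel width and
# target sensitivity are hypothetical.
for t_sky in (200.0, 2000.0):
    t = integration_time_s(t_sky, delta_t_k=1.0e-3, bandwidth_hz=1.0e6)
    print(f"T_sky = {t_sky:6.0f} K  ->  t = {t / 3600.0:7.1f} hours")
```

Under these assumptions the noise target is reached in about 11 hours for a 200 K sky, and the time grows as the square of the sky temperature. Thermal noise is therefore not the fundamental obstacle; the separation of the foregrounds is.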

The limiting factors in these measurements are likely to be systematics [272]. 
Foregrounds vary across the sky, coupling to the frequency-dependent side- 
lobes (though the variation should be smooth and hence removable [274]). 
One strategy to avoid confusion with features in the foreground is to use
multiple (wide) fields across which foregrounds are uncorrelated. Another
problem may be the recombination lines of hydrogen and other common elements,
which give rise to spectral features throughout the relevant frequency range. 
Fortunately, the frequencies of these lines are known precisely, and their rel- 
ative strengths are related through simple physics. Observations seeking the 
detection of a global signal will necessarily be made with high spectral resolu- 
tion, both to separate the recombination lines and to prevent contamination 
from terrestrial radio frequency interference. 

The instrumental calibration required to detect these faint, $\sim 10^{-5}$ inflections
in the radio spectrum is an exceptionally difficult task. The usual strategy is 
to compare the sky spectrum with a calibration source that is known to be 
smooth. Two clear questions arise: first, is the comparison source truly smooth, 
and second, can the two signals be brought together in the receiving system 
through paths with identical (or at least calibrate-able) gains? One approach 
is to use internal resistive loads that provide thermal power in proportion 
to the load's temperature. In this case, the impedance matching is critical, 
and there must be ways to accurately assess matching (or reflections) in the 
signal paths from the antenna that observes the sky and from the internal 
load. Alternatively, one can search for sources in the sky that could provide 
"external loads" with smooth spectra; one possible source is the Moon [272], 
which emits thermally at these frequencies (although its brightness temperature
is actually cooler than the sky temperature at the low-frequency end of the
relevant range). The advantage of an external load is that the calibration signal
enters the receiving system through the same path as the sky signal. However,
the Moon is reflective (as well as emissive), so terrestrial radio frequency
interference is faintly reflected back from its surface, making that a
likely contaminant.

The precision measurement of the shape of the CMB spectrum by the FIRAS 
instrument on COBE [275] makes a sobering comparison. FIRAS compared 
the CMB spectrum from the sky with an onboard black body source to deter- 
mine precisely the CMB spectrum. It found agreement with a Planck curve 
with an rms deviation of five parts in $10^5$ over the wavelength range 0.5 to
5 mm. This was a space-based instrument, observing a thermal sky spectrum 
in comparison to a thermal load, in a wavelength range where radio interfer- 
ence is negligible. The highly-redshifted 21 cm spectrum will surely be much 
more difficult to measure. 

Fortunately, even if systematics do prove to be insurmountable, there are other 
ways to measure the evolution of the mean signal. For example, Compton 
scattering by a massive, low-redshift galaxy cluster shifts the spectrum in fre- 
quency space in a predictable way, so that differential measurements between 
the cluster and its surroundings could be made [276]. This avoids many of the 
systematic difficulties, but it requires relatively high-resolution observations
(and hence may not measure the truly "global" signal). Another problem is 
the association of strong radio galaxies with the cores of massive galaxy clus- 
ters, because the continuum flux density will be systematically greater than 
the surrounding reference fields. Slight errors in gain calibration as a function
of frequency ("passband calibration") may leave residuals that mimic the
expected global signal. The calibration error would likely be present for all 
clusters, although meticulous comparison with strong radio sources unassoci- 
ated with X-ray clusters may be helpful. A second possibility is to use the 
21 cm fluctuations to measure the mean background through their redshift- 
space anisotropies [277], although in practice this is beyond the capabilities 
of the first generation of interferometers [278]. Finally, if the ionized bubbles 
can be separated from the residual neutral gas (through imaging or one-point 
statistics [279], for example), their contrast and the abundance of each phase 
provide a straightforward measurement of $\delta T_b$.



4 The Power Spectrum 

While the global 21 cm background contains a great deal of information about 
the mean evolution of the sources, each and every component discussed in §3 
also fluctuates significantly. For the density field this is obvious: the evolving 
cosmic web imprints growing density fluctuations on the matter distribution. 
For the other aspects, the discrete nature of the luminous sources gives rise to 
21 cm fluctuations. Ionized gas is organized into discrete HII regions (at least 
in the most plausible models), and the Ly$\alpha$ background and X-ray heating will
also be concentrated around galaxies. The single greatest advantage of the 21 
cm line is that it allows us to separate this fluctuating component both on the 
sky and in frequency (and hence cosmic time). Thus we can study the sources 
and their effects on the IGM in detail. It is the promise of these "tomographic" 
observations that makes the 21 cm line such a singularly attractive probe, and 
in the next several sections we will describe their expected forms in many 
different physical regimes. 

Observing the 21 cm fluctuations has one practical advantage as well. The 
difficulty of extracting the global evolution lies in its relatively slow evolution. 
On the small scales relevant to the fluctuations, the gradients increase
dramatically. At the edge of an HII region, for example, $\delta T_b$ drops by ~ 20 mK
essentially instantaneously. As a result, separating them from the smoothly 
varying astronomical foregrounds may be much easier. We will discuss this and 
other experimental issues in §9. Unfortunately, as we will also see, constructing 
detailed images will remain extremely difficult because of their extraordinary 
faintness; telescope noise is comparable to or exceeds the signal except on 
rather large scales. Thus a great deal of attention has recently focused on 
using statistical quantities readily extractable from low signal-to-noise maps
to constrain the IGM properties. This is motivated in part by the success 
of CMB measurements and galaxy surveys at constraining cosmological pa- 
rameters through the power spectrum. In our case, although any number of 
statistical quantities may be useful (especially during reionization, when the 
fluctuations are highly non-Gaussian), we will take the power spectrum as our
primary example. In the next several sections we will describe how both imag- 
ing and statistics can teach us about the high-redshift Universe. The goal of 
this section is to develop the formalism necessary to compute the 21 cm power 
spectrum. Most of our discussion will be sufficiently general that it can easily 
be extended to other statistics of interest. 

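We first define the fractional perturbation to the brightness temperature,
$\delta_{21}(\mathbf{x}) = [\delta T_b(\mathbf{x}) - \overline{\delta T}_b]/\overline{\delta T}_b$, a zero-mean random field (here $\overline{\delta T}_b$ is to be
evaluated at the observer). We will be interested in its Fourier transform $\tilde{\delta}_{21}(\mathbf{k})$.
Its power spectrum is defined to be

$$\langle \tilde{\delta}_{21}(\mathbf{k}_1)\,\tilde{\delta}_{21}(\mathbf{k}_2) \rangle = (2\pi)^3\,\delta_D(\mathbf{k}_1 + \mathbf{k}_2)\,P_{21}(\mathbf{k}_1), \qquad (91)$$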

where $\delta_D(\mathbf{x})$ is the Dirac delta function and the angular brackets denote an
ensemble average. Power spectra for other random fields (such as the fractional
overdensity $\delta$, the ionized fraction, etc.), or cross-power spectra between
two different fields, can be defined in an analogous fashion. In general, a
power spectrum $P(\mathbf{k})$ is the three-dimensional Fourier transform of
the corresponding two-point function and thus parameterizes the correlations
present in the appropriate field. We will often use the dimensionless version
$\Delta^2(k) = (k^3/2\pi^2)\,P(k)$, which roughly quantifies the variance when the field
is smoothed on the scale $x = 2\pi/k$.
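
As a small numerical illustration of the dimensionless power spectrum, the sketch below evaluates $\Delta^2(k) = k^3 P(k)/(2\pi^2)$ for a toy input. The function names and the power-law form $P(k) = A k^n$ are assumptions made purely for illustration; they do not represent the 21 cm spectrum itself.

```python
import math


def dimensionless_power(k, p_of_k):
    """Return Delta^2(k) = k^3 P(k) / (2 pi^2) for wavenumber k and power P(k)."""
    return k ** 3 * p_of_k / (2.0 * math.pi ** 2)


def power_law(k, amplitude=1.0, index=-2.0):
    """Toy power-law spectrum P(k) = A k^n (an illustrative assumption)."""
    return amplitude * k ** index


if __name__ == "__main__":
    # Tabulate Delta^2 over three decades in k for the toy spectrum.
    for k in (0.1, 1.0, 10.0):
        print(k, dimensionless_power(k, power_law(k)))
```

For a spectrum with index $n = -3$ the dimensionless power is independent of scale, which is one way to see that $\Delta^2$ measures the variance contributed per logarithmic interval in $k$.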

As is obvious from equations (18) and (20), the brightness temperature de- 
pends on a number of input parameters. Expanding those equations to linear 
order in each of the perturbations, we can write 

$$\delta_{21} = \beta \delta_b + \beta_x \delta_x + \beta_\alpha \delta_\alpha + \beta_T \delta_T - \delta_{\partial v}, \tag{92}$$

where each $\delta_i$ describes the fractional variation in a particular quantity: $\delta_b$ for
the baryonic density, $\delta_\alpha$ for the Ly$\alpha$ coupling coefficient $x_\alpha$, $\delta_x$ for the neutral
fraction (note that using the ionized fraction would cause a sign change), $\delta_T$
for $T_K$, and $\delta_{\partial v}$ for the line-of-sight peculiar velocity gradient. The expansion
coefficients $\beta_i$ are

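$$\beta = 1 + \frac{x_c}{x_{\rm tot}(1 + x_{\rm tot})}, \tag{93}$$

$$\beta_x = 1 + \frac{x_c^{HH} - x_c^{eH}}{x_{\rm tot}(1 + x_{\rm tot})}, \tag{94}$$

$$\beta_\alpha = \frac{x_\alpha}{x_{\rm tot}(1 + x_{\rm tot})}, \tag{95}$$

$$\beta_T = \frac{T_\gamma}{T_K - T_\gamma} + \frac{1}{x_{\rm tot}(1 + x_{\rm tot})} \left( x_c^{eH} \frac{d\ln \kappa_{10}^{eH}}{d\ln T_K} + x_c^{HH} \frac{d\ln \kappa_{10}^{HH}}{d\ln T_K} \right), \tag{96}$$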

where $x_{\rm tot} = x_c + x_\alpha$ and we have split the collisional term into the dominant
H-e$^-$ and H-H components ($x_c^{eH}$ and $x_c^{HH}$, respectively) where necessary. Here
we have assumed $T_c = T_K$ throughout; this is reasonable in most cases but, if
not, the expressions become much more complicated. Note that, by linearity,
the Fourier transform $\tilde{\delta}_{21}$ can be written in a similar fashion. Each of these
expressions has a simple physical interpretation. For $\beta$, the first term describes
the increased matter content and the second describes the increased collisional
coupling efficiency in dense gas. For $\beta_x$, the two terms describe direct
fluctuations in the ionized fraction and the effects of the increased electron density
on $x_c$. (The latter is only important in partially ionized regions; 21 cm
emission is negligible in HII regions, of course.) $\beta_\alpha$ simply measures the fractional
contribution of the Wouthuysen-Field effect to the coupling. The first term in
$\beta_T$ parameterizes the speed at which the spin temperature responds to
fluctuations in $T_K$, while the others include the explicit temperature dependence of
the collision rates. The challenge of the next several sections will be to
compute these perturbation fields in the appropriate physical regimes. Note that
all of these terms, with the crucial exception of $\delta_{\partial v}$, are isotropic (see §4.1).
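
To make the behaviour of the expansion coefficients concrete, the following sketch evaluates them for given coupling coefficients and temperatures. It assumes the forms $\beta = 1 + x_c/[x_{\rm tot}(1+x_{\rm tot})]$, $\beta_x = 1 + (x_c^{HH} - x_c^{eH})/[x_{\rm tot}(1+x_{\rm tot})]$, $\beta_\alpha = x_\alpha/[x_{\rm tot}(1+x_{\rm tot})]$, and $\beta_T = T_\gamma/(T_K - T_\gamma)$ plus the collisional term, with $T_c = T_K$. The function name, argument names, and the logarithmic rate derivatives passed in as plain numbers are hypothetical; the collision rates themselves are not modeled.

```python
def expansion_coefficients(x_c_eH, x_c_HH, x_alpha, T_K, T_gamma,
                           dlnkappa_eH=0.0, dlnkappa_HH=0.0):
    """Return the linear expansion coefficients of equation (92) as a dict.

    x_c_eH, x_c_HH : collisional coupling coefficients (H-e and H-H)
    x_alpha        : Ly-alpha (Wouthuysen-Field) coupling coefficient
    T_K, T_gamma   : gas kinetic and CMB temperatures (same units)
    dlnkappa_*     : d ln(kappa_10) / d ln(T_K) for each collision partner
                     (hypothetical inputs supplied by the caller)
    """
    x_c = x_c_eH + x_c_HH
    x_tot = x_c + x_alpha
    if x_tot <= 0.0:
        raise ValueError("x_tot must be positive")
    if T_K == T_gamma:
        raise ValueError("beta_T is singular where T_K equals T_gamma")
    denom = x_tot * (1.0 + x_tot)
    collisional_T = (x_c_eH * dlnkappa_eH + x_c_HH * dlnkappa_HH) / denom
    return {
        "beta": 1.0 + x_c / denom,
        "beta_x": 1.0 + (x_c_HH - x_c_eH) / denom,
        "beta_alpha": x_alpha / denom,
        "beta_T": T_gamma / (T_K - T_gamma) + collisional_T,
    }


if __name__ == "__main__":
    # Strong Wouthuysen-Field coupling and hot gas: the saturated limit.
    print(expansion_coefficients(0.0, 0.0, 1.0e6, 1.0e4, 50.0))
```

In the saturated limit ($x_\alpha \gg 1$, $T_K \gg T_\gamma$) the sketch returns $\beta \approx 1$, $\beta_\alpha \approx 0$, and $\beta_T \approx 0$, so only density, neutral fraction, and velocity fluctuations remain.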

Of course, the 21 cm background directly measures the baryonic density field
$\delta_b$ (or even more precisely, the hydrogen density field). For most purposes,
this is equivalent to the total matter density $\delta$ and in the following we will set
$\delta_b = \delta$ throughout. However, note that on small scales the finite pressure of
the baryons introduces a cutoff absent from the dark matter [280]; in detail, 
galaxy formation processes and feedback can also work on the two separately. 

For context, Figure 9 shows how these expansion coefficients evolve in our
fiducial Pop II structure formation model (see §3.5.2). The density coefficient
$\beta$ increases with time until z ~ 20 before abruptly falling to unity. At z > 20,
collisions are only marginally important so the extra collisional coupling
imparted by an increased density has a relatively large effect; at lower redshifts,
collisional coupling is negligible compared to the Wouthuysen-Field effect so
the second term in equation (93) vanishes. $\beta_x$ behaves nearly identically,
because (outside of HII regions) the ionized fraction remains small. Fluctuations
in the Ly$\alpha$ background are only important over a limited redshift range (where
$x_\alpha \sim 1$); at lower redshifts, all the gas is strongly coupled so fluctuations in
the background are unimportant. The temperature coefficient has the most
complicated dependence because it depends on the mix of Compton heating
and collisional coupling. Note that the apparent singularity occurs where
$T_K = T_\gamma$; it is not physical because $\delta T_b$ also vanishes at the same point. At
lower redshifts, $T_K \gg T_\gamma$ and the emission saturates, $\beta_T \to 0$.

Fig. 9. Redshift dependence of perturbative expansion coefficients in the fiducial
Pop II model of Fig. 7. We show $\beta$ (solid curve), $\beta_x$ (dotted curve), $\beta_\alpha$ (dot-dashed
curve), and $\beta_T$ (dashed curve). Note that the singularity in $\beta_T$ at z = 17 is artificial
in that it does not actually appear in the fluctuation amplitude.

By equation (91), the power spectrum clearly contains all possible terms of
the form $P_{\delta_i \delta_j}$; some or all could be relevant in any given situation. Of course,
in most instances the various $\delta_i$ will be correlated in some way; statistical 21
cm observations ideally hope to measure these separate quantities. We have
already included some of the obvious correlations in equations (93)-(96), such
as the variation of the collision rate with the ionized fraction. But we have
left others implicit: for example, as we will see later, it is likely that overdense
regions are ionized first. Or, if photoionization equilibrium applies, $\delta_x$ depends
on the density field. A more subtle example is the relation of $\delta_\alpha$ to the other
quantities; as we saw in §2.3.3, it depends on the radiation spectrum and hence
on density, neutral fraction, and temperature in addition to the background
flux. We have left this implicit in the interests of simplicity (and it is often
ignored in the literature [130, 192]), but it should be included in detailed
calculations.

In all of these expansions, one must bear in mind that $\delta_x$ is of order unity if
the ionization field is built from HII regions. In that case terms such as $\delta\delta_x$ are
in fact first order and must be retained. This leads to non-trivial four-point
terms in the power spectrum (e.g., [278]); in practice, these terms may need
to be computed along with the two-point correlations.

Because of the analogy to the CMB, some of the 21 cm literature makes use 
of the angular power spectrum (e.g., [158,281]). This integrates out the line 
of sight information (which is automatically measured in any real experiment 
thanks to the frequency axis), so the three-dimensional version is generally 
preferred (especially at large angular scales, where the shape of the matter 
power spectrum is such that relatively small scale density fluctuations domi- 
nate [277]). But the angular power spectrum is still useful for some purposes 
- especially in correlations with the CMB or other two-dimensional fields on 
the sky [282-287] - so for convenience we will define it here as well. For the 
moment we neglect the velocity term. In that case $\delta_{21}$ is isotropic, and the
brightness temperature (relative to the CMB) of the sky at frequency $\nu$ can
be written

$$\delta_{21}(\hat{n}, \nu) = \int dr\, R(r; r_0)\, \delta_{21}(\hat{n}, r), \tag{97}$$



where the integral is over the line of sight distance and $R(r; r_0)$ describes the
frequency response of the experiment; it is typically sharply peaked around
$r_0$, the conformal distance to the redshift of interest. We construct the angular
power spectrum from the spherical harmonic expansion of $\delta_{21}$, conventionally
written

$$\delta_{21}(\hat{n}, \nu) = \sum_{lm} a_{lm}(\nu)\, Y_{lm}(\hat{n}). \tag{98}$$



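Using the identity

$$e^{i\mathbf{k}\cdot\mathbf{r}} = 4\pi \sum_{lm} i^l\, j_l(kr)\, Y^*_{lm}(\hat{k})\, Y_{lm}(\hat{n}), \tag{99}$$

where $j_l(x)$ is the spherical Bessel function of order $l$, we find

$$a_{lm}(\nu) = 4\pi i^l \int \frac{d^3k}{(2\pi)^3}\, \alpha_l(k, \nu)\, Y^*_{lm}(\hat{k})\, \tilde{\delta}_{21}(\mathbf{k}, \nu), \tag{100}$$

$$\alpha_l(k, \nu) = \int dr\, R(r; r_0)\, j_l(kr). \tag{101}$$

The (dimensionless) cross-frequency angular power spectrum is defined by

$$\langle a_{l_1 m_1}(\nu_1)\, a^*_{l_2 m_2}(\nu_2) \rangle = \delta_{l_1 l_2}\, \delta_{m_1 m_2}\, C_{l_1}(\nu_1, \nu_2). \tag{102}$$

Performing the integrals,

$$C_l(\nu_1, \nu_2) = 4\pi \int \frac{d^3k}{(2\pi)^3}\, P_{21}(k)\, \alpha_l(k, \nu_1)\, \alpha_l(k, \nu_2). \tag{103}$$

In temperature units, equation (103) takes a prefactor $\delta\bar{T}_b^2$.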

To gain some intuition, consider two limiting forms of equation (103) for a pure 
power law input spectrum [158]. First suppose that the frequency response is 
a delta function (valid on angular scales much larger than the radial response, 
or small $l$); then

$$\frac{l^2 C_l(\nu, \nu)}{2\pi} \propto \delta\bar{T}_b^2(\nu)\, \Delta^2_{21}(l/r_0), \qquad l\,\delta r/r_0 \ll 1, \tag{104}$$



where $\delta r$ characterizes the bandwidth of the observation. Thus on large angular
scales, the angular fluctuations should simply trace the corresponding density
fluctuations. (This is not, however, true for 21 cm fluctuations, because the
power-law approximation is not valid [277].) Small scale modes, on the other
hand, suffer a cancellation from the oscillatory Bessel functions, and

$$\frac{l^2 C_l(\nu, \nu)}{2\pi} \propto \delta\bar{T}_b^2\, \frac{\Delta^2_{21}(l/r_0)}{l\,\delta r/r_0}, \qquad l\,\delta r/r_0 \gg 1. \tag{105}$$



Here the angular fluctuations are suppressed by a factor that equals the number
of wavelengths fitting across the band. Clearly, retaining the frequency
dimension will avoid this cancellation, so direct measurements of $P_{21}(\mathbf{k})$ are
preferred.



4.1 Redshift-Space Distortions



In general, we expect fluctuations in density, ionization fraction, Ly$\alpha$ flux,
and temperature to be statistically isotropic, because the physical processes
responsible for them have no preferred direction [e.g., $\tilde{\delta}(\mathbf{k}) = \tilde{\delta}(k)$]. 18
However, there are two effects that do break the isotropy of 21 cm fluctuations,
both well-known from previous studies of large scale structure. The first is
that transverse and line of sight distances scale differently in non-Euclidean
spacetimes, which artificially distorts the appearance of any isotropic
distribution. This well-known Alcock-Paczyński (AP) effect [289-292] depends on the
underlying background cosmology, and we will discuss it in §4.2. 19 Second,
peculiar velocity gradients introduce redshift space distortions. Bulk flows on
large scales, and in particular infall onto massive structures, compress the
signal in redshift space (the "Kaiser" effect), enhancing the apparent clustering
amplitude [273,277,281,293]. On small scales, random motions in virialized
regions create elongation in redshift space (the "finger of God" effect),
reducing the apparent clustering amplitude [294]. We will now discuss each of these
effects in turn.

18 Actually, this assumption can break down on extremely large scales, because then
the growth of structure with redshift becomes important. Fortunately, the 21 cm
field only contains rapidly evolving features on such large scales at the tail end of
reionization. The evolution is generally not important on the scales accessible to
observations [278,288].

The anisotropy in the power spectrum induced by the Kaiser effect could allow 
an interesting separation of astrophysical and cosmological contributions to 
the 21 cm fluctuations [277]. On large scales, where linear theory holds, the 
fractional perturbation in the radial peculiar velocity gradient $dv_r/dr$ has a
Fourier transform proportional to that of the density field, $\tilde{\delta}_{\partial v} = -\mu^2 f \tilde{\delta}$, where
$\mu$ is the cosine of the angle between the wavevector $\mathbf{k}$ and the line of sight
direction and $f = d\ln D/d\ln a \approx \Omega_m^{0.6}(z)$ [293]. (Intuitively, the two powers of
$\mu$ result from taking the line of sight components of the peculiar velocity and
of its gradient.) Thus, in Fourier space, brightness temperature fluctuations
have the form [281]:

$$\tilde{\delta}_{21} = \mu^2 f\, \tilde{\delta} + \tilde{\delta}_{\rm iso}, \tag{106}$$

where we have collected all the statistically isotropic terms in equation (92)
into $\tilde{\delta}_{\rm iso}$. Neglecting "second-order" terms (see below) and setting $f = 1$ in the
high-redshift limit, the total power spectrum can therefore be written as [277]:

$$P_{21}(\mathbf{k}) = \mu^4 P_{\delta\delta} + 2\mu^2 P_{\delta_{\rm iso}\delta} + P_{\delta_{\rm iso}\delta_{\rm iso}}. \tag{107}$$

Equation (107) has some interesting features. Even in the simple case where
$\delta_{\rm iso} = \delta$ (i.e., there are no spin temperature or ionization fraction fluctuations),
the velocity term boosts the spherically averaged power spectrum by a factor
$\langle (1 + \mu^2)^2 \rangle = 1.87$: redshift space distortions are not a small effect.
Furthermore, because of the simple form of this polynomial, measuring the power at
$\geq 3$ values of $\mu$ should allow one to determine $P_{\delta\delta}$, $P_{\delta_{\rm iso}\delta}$, and $P_{\delta_{\rm iso}\delta_{\rm iso}}$ for each
$k$. In particular, we can isolate the contribution from density fluctuations $P_{\delta\delta}$.
This would not have been possible without peculiar velocity flows: comparison
to equation (92) shows that, in the most general case, $P_{\delta_{\rm iso}\delta}$ and $P_{\delta_{\rm iso}\delta_{\rm iso}}$ contain
several different power spectra, including those of the density, neutral fraction,
and spin temperature as well as their cross power spectra [291]. Note that $P_{\delta\delta}$
is simply the baryonic matter power spectrum at redshift z; assuming that we
have independent measurements of that, we can also determine the
astrophysically interesting prefactor $\delta\bar{T}_b \propto \bar{x}_{HI}(1 - T_\gamma/T_S)$ from the velocity field. In
particular, at late times, when X-ray heating and Ly$\alpha$ coupling have driven
$T_S \gg T_\gamma$, we can in principle extract the mean neutral fraction $\bar{x}_{HI}(z)$ [277].

19 For the moment, we shall assume that the true underlying cosmology is
known from, e.g., CMB studies; in that case the AP effect can be ignored.
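
The angular separation described above amounts to a small linear solve, which the sketch below illustrates in pure Python. It assumes the polynomial form of equation (107), $P_{21} = \mu^4 P_{\delta\delta} + 2\mu^2 P_{\delta_{\rm iso}\delta} + P_{\delta_{\rm iso}\delta_{\rm iso}}$, with noiseless power measured at exactly three values of $\mu$ for a single $k$. All function names and the numerical inputs are hypothetical; a real analysis would fit many noisy $\mu$ bins rather than invert a 3x3 system.

```python
from fractions import Fraction


def spherical_average_boost():
    """Average of (1 + mu^2)^2 over mu in [0, 1]: 1 + 2/3 + 1/5 = 28/15."""
    return Fraction(1) + Fraction(2, 3) + Fraction(1, 5)


def power_21(mu, p_dd, p_id, p_ii):
    """Equation (107): mu^4 P_dd + 2 mu^2 P_(iso,d) + P_(iso,iso)."""
    return mu ** 4 * p_dd + 2.0 * mu ** 2 * p_id + p_ii


def separate_components(mus, powers):
    """Recover (p_dd, p_id, p_ii) from the power at three distinct mu values.

    Uses Gaussian elimination with partial pivoting on the 3x3 system.
    """
    n = 3
    a = [[m ** 4, 2.0 * m ** 2, 1.0, p] for m, p in zip(mus, powers)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-14:
            raise ValueError("mu values do not give an invertible system")
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col, n + 1):
                a[row][c] -= factor * a[col][c]
    solution = [0.0] * n
    for row in reversed(range(n)):
        rest = sum(a[row][c] * solution[c] for c in range(row + 1, n))
        solution[row] = (a[row][n] - rest) / a[row][row]
    return tuple(solution)


if __name__ == "__main__":
    print(float(spherical_average_boost()))  # 28/15, quoted as 1.87 in the text
    truth = (2.0, 0.7, 1.3)  # made-up component powers at one k
    mus = (0.0, 0.5, 1.0)
    measured = [power_21(m, *truth) for m in mus]
    print(separate_components(mus, measured))
```

The three $\mu$ values must be distinct in $\mu^2$ for the system to be invertible, which is why modes both along and across the line of sight are needed.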

Disentangling the other components will be more difficult, since there are
several remaining power spectra to be determined from the two measured
quantities $P_{\delta_{\rm iso}\delta}(k)$ and $P_{\delta_{\rm iso}\delta_{\rm iso}}(k)$. Anything we can learn from these components will
require properly modeling the processes and the correlations between them.
This will be simplest in regimes where one or more of the terms can be
neglected. For example, during the earliest stages of reionization (when $\delta_x$ is
negligible), one might be able to measure the power spectrum of spin
temperature fluctuations as well as its correlations with density. At late times (when
$T_S \gg T_\gamma$ and $\delta T_b$ becomes independent of $T_S$), one might likewise ignore spin
temperature fluctuations and measure the ionization fraction fluctuations $P_{\delta x}$
and $P_{xx}$. In both these cases, there is generally a mild degeneracy in parameter
estimation which can likely be broken by making weak assumptions about the
asymptotic behaviour of the power spectra.

An additional difficulty comes from the correlations of "second-order" terms in
the perturbation expansion, such as $\delta\delta_x$, that produce four-point terms in the
power spectrum. As mentioned in the previous section, $\delta_x$ is not necessarily a
small parameter, so these terms can be substantial. Unfortunately, they also
produce terms with non-trivial $\mu$ dependence that depend on the particulars
of reionization [278]. The presence of these terms makes attempts to separate
the $\mu^n$ powers during reionization more difficult; the prospects are much better
before $\delta_x$ becomes important. We will return to this question in §8.3.2.

Note that the velocity term also affects the angular power spectrum; the extra
$\mu^2$ terms alter the angular dependence of the signal and hence the
$l$-distribution of the power spectrum [277,281].

On small scales, random motions in virialized regions wash out features in 
redshift space [295]. If we only study line-of-sight modes (which are the most 
immune to foreground contamination), this places an irreducible lower limit 
on the size of structures we can study. However, for all of the planned ex- 
periments, the limiting angular resolution, which dilutes the radial signal by 
mixing together neutral and ionized structures in the transverse direction, cor- 
responds to much larger scales than the "fingers of God." Thus these will not 
affect any observables. Even at much higher angular resolution, virial motions 
should not pose a significant obstacle to detecting ionized bubbles in the 21 
cm data so long as photoionized bubbles are significantly larger than the virial 
extent of halos (almost definitely a safe assumption; see §8). 






An important caveat to recovering redshift space distortions (and the AP 
test below) is that it requires a high signal to noise measurement of the an- 
gular structure of the signal. Unfortunately, the noise is anisotropic: radio 
foregrounds have much more power across the sky than in the line of sight di- 
rection. (Indeed, this very feature is crucial to foreground removal algorithms; 
see §9.3.) Moreover, it is much easier to probe small physical scales in the 
frequency direction than across the angular dimensions. As a result, taking 
advantage of this "separation of powers" will likely require second generation 
experiments (see [278] and Fig. 32). 

4.2 Cosmological Tests

The preceding discussion assumed that the correct underlying cosmological
model was already known (from CMB measurements, for example). Using an
incorrect cosmological model creates apparent errors in the scaling of angular
sizes (which depend on the angular diameter distance $D_A$) compared to line of
sight sizes (which depend on the Hubble parameter), introducing an artificial
anisotropy even in intrinsically isotropic distributions. This Alcock-Paczyński
(AP) effect [289] can be used to measure cosmological parameters, though
doing so in practice has proven difficult. Galaxy redshift surveys generally do
not extend to sufficiently high redshifts for it to be important (although they
may become useful if supplemented by deep galaxy cluster surveys [296]).
Quasar surveys (e.g., SDSS), on the other hand, do have sufficient redshift
depth but tend to be too sparse for optimal parameter estimation [297]. In
principle the AP effect can be measured through distortions of the Ly$\alpha$ forest
power spectrum [298], but the practical challenges have proven substantial,
and the first results are only just arriving [299]. Because the 21 cm signal is
all-sky and so does not suffer from sparseness problems, one might hope that
it will allow a definitive detection of the AP effect [94,290,292].

In keeping with our former discussion we consider the AP effect in terms of 
the power spectrum [292], though a presentation in terms of the correlation 
function is equally illuminating [290]. The AP effect distorts the shape and 
normalization of the 21 cm power spectrum to [292]: 

$$P_{21}(\mathbf{k}) = \mu^6 P_{\mu^6}(k) + \mu^4 P_{\mu^4}(k) + \mu^2 P_{\mu^2}(k) + P_{\mu^0}(k). \tag{108}$$

The $P_{\mu^4}$, $P_{\mu^2}$, and $P_{\mu^0}$ terms include the redshift space distortions due to
peculiar velocity gradients (eq. 107), but here they are further modified by
the AP effect. How can we disentangle the two? The key is the presence of
the $P_{\mu^6}(k)$ term, which is solely due to the AP effect. 20 It therefore allows a
measurement of

$$(1 + \alpha) = \frac{H D_A\ (\text{Assumed Cosmology})}{H D_A\ (\text{True Cosmology})}. \tag{109}$$

Conceptually, we constrain cosmological parameters by varying them until
$\alpha = 0$. Once the underlying cosmology is known, the AP contribution to
the $P_{\mu^4}$, $P_{\mu^2}$, and $P_{\mu^0}$ terms can be identified and removed, leaving only the
intrinsic power spectrum that we have already discussed. Note that the $k$ and
$z$ dependence of the AP effect are well known and can aid in separating it
from noise and foreground contributions.

20 This is not precisely true, because lensing can also introduce such a term [285],
although it is generally small.

In principle the peculiar velocity anisotropy is also sensitive to the background
cosmological model: substituting the full expression for the linear peculiar
velocity perturbation, $P_{z\text{-space}} \propto (1 + f\mu^2)^2 P_{\text{real-space}}$ [293]. However, because
the universe is so close to Einstein-de Sitter at high redshift, this dependence
turns out to be negligible (with $D \propto a$ and $f = 1$). In contrast, the AP effect
does remain sensitive to the background cosmology out to high redshifts [291].
This can be easily seen from the expression for angular diameter distance in
a flat Universe:

$$D_A(z) = \frac{c}{1 + z} \int_0^z \frac{dz'}{H(z')}. \tag{110}$$

The dependence on cosmology comes mostly from the contribution to the
integral at low redshifts, where $\Omega_\Lambda(z) > 0$. Unfortunately, this also implies that
the information content is essentially the same as that of the CMB, where the
angular diameter distance to the surface of last scattering has been accurately
measured using acoustic oscillations as a standard ruler [20,224,300]. Any
additional information must come from the difference between $D_A(z)$ and
$H(z)$ at the redshift of the 21 cm survey and at recombination. A relative
accuracy of better than ~ 5% in $H$ and $D_A$ must be obtained for the 21 cm
AP effect to yield improvements on the CMB [292]. Unfortunately, real-world
challenges prevent any planned experiments (or even instruments an order of
magnitude larger than the SKA) from reaching this level [278].
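
As a rough numerical illustration of the AP test, the sketch below evaluates the standard flat-Universe angular diameter distance, $D_A(z) = [c/(1+z)] \int_0^z dz'/H(z')$, and the distortion parameter defined by $(1 + \alpha) = (H D_A)_{\rm assumed}/(H D_A)_{\rm true}$. It assumes a flat model containing only matter and a cosmological constant (radiation neglected); the function names and the parameter values are illustrative choices, not values taken from the text.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble(z, h0=70.0, omega_m=0.3):
    """H(z) in km/s/Mpc for a flat matter + Lambda model (radiation neglected)."""
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def angular_diameter_distance(z, h0=70.0, omega_m=0.3, steps=2000):
    """D_A(z) in Mpc from c/(1+z) * int_0^z dz'/H(z'), via Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    dz = z / steps
    total = 1.0 / hubble(0.0, h0, omega_m) + 1.0 / hubble(z, h0, omega_m)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble(i * dz, h0, omega_m)
    comoving = C_KM_S * total * dz / 3.0
    return comoving / (1.0 + z)


def ap_alpha(z, assumed, true):
    """Return alpha, with (1 + alpha) = (H D_A)_assumed / (H D_A)_true.

    `assumed` and `true` are (h0, omega_m) tuples.
    """
    hd_assumed = hubble(z, *assumed) * angular_diameter_distance(z, *assumed)
    hd_true = hubble(z, *true) * angular_diameter_distance(z, *true)
    return hd_assumed / hd_true - 1.0


if __name__ == "__main__":
    # A mis-assumed matter density produces a nonzero distortion at z = 8.
    print(ap_alpha(8.0, assumed=(70.0, 0.25), true=(70.0, 0.30)))
    # Changing only H0 leaves H * D_A, and hence alpha, unchanged.
    print(ap_alpha(8.0, assumed=(60.0, 0.30), true=(70.0, 0.30)))
```

Because the product $H D_A$ is independent of the Hubble constant, the test in this toy model constrains only the shape of the expansion history, here through the matter density.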

The 21 cm AP effect is best measured in the early stages of reionization, when
the effect of ionized bubbles is negligible (these introduce substantial
additional power in 21 cm temperature fluctuations as well as non-trivial $\mu$
dependence; see §8.3.2). Once reionization is underway, the large scale anisotropy
pattern is strongly dominated by the distribution of HII regions rather than
the background cosmology [291]. A clever way of disentangling these effects
might still be possible (see [290] for some suggestions) but is likely to be
difficult.



4.3 Secondary Fluctuations

Highly redshifted 21 cm emission and absorption constitute a new source of
diffuse background light. Because it emerges from the distant Universe, it is 
subject to all of the same radiative transfer effects as the CMB; these are 
collectively termed "secondary fluctuations" and modify the observed power 
spectrum (or specific resolved features in maps). Secondaries allow us to use 
the 21 cm background to study the lower-redshift universe, yielding similar 
information to analogous measurements with the CMB. However, the 21 cm 
background does have two advantages over the CMB. First, because it is a 
spectral line, the 21 cm background is not a single full-sky map but rather 
a series of them at many closely-spaced redshifts [301]. Second, it contains 
structure down to small physical scales (in principle the IGM Jeans mass), 
whereas the CMB cuts off at $\ell \sim 1200$ because of Silk damping.

These properties are particularly useful for gravitational lensing. When con- 
sidered as a single diffuse broadband background, the 21 cm signal is less useful 
than the CMB because it is difficult to unambiguously separate the lensing 
modifications from the intrinsic clustering of the signal [282]. Most of the 
strategies for extracting lensing information from the CMB rely on two prop- 
erties that the 21 cm background does not share. First, they assume that any 
observed non-Gaussianities are induced by lensing; as we shall see, this will
not be true for 21 cm emission during reionization. Second, the CMB has a 
well-defined acoustic peak structure from which power is transferred, whereas 
the 21 cm background is relatively featureless (though some structure does 
appear during reionization). 

Fortunately, the two advantages identified above still outweigh these difficul- 
ties. Conceptually, 21 cm maps at many different redshifts are all at sufficiently 
large distances that they experience nearly the same lensing potential. This 
induces a cross-correlation between source screens from which the projected 
mass distribution can be extracted [301]. The cross-correlation is visible both
in the variance map (because lensing changes the physical scale subtended
by each angular patch) and in the shear map. Typically, CMB lensing is
described through a perturbative expansion of the deflection angle $\delta\theta$.
Unfortunately, this does not work well for the 21 cm background because of the
large temperature gradients on small scales, so new methods in which the de- 
flection field is built by progressively adding smaller and smaller wavelength 
modes in Fourier space are required [285]. Lensing typically modifies the power 
spectrum by ~ 1% [285,301]. 
In practice, the cross-correlation between neighboring maps is noisy and may
not be the best way to extract the lensing information. Another way to approach
it is through the aspherical perturbations induced by lensing, with the
largest modifications to modes along the line of sight because they sample
the smallest projected scales [285,286]. Like redshift space distortions, this
introduces $\mu^2$ and $\mu^4$ components into the power spectrum. A quadratic
estimator, generalized from those used for the CMB, can take advantage of this
angular dependence to reconstruct the projected mass field from 21 cm
observations [286]. Because such estimators extract the most information from the
smallest resolved scales, they require high angular resolution. As such, SKA- 
class instruments will be needed to probe lensing effectively. However, such an 
instrument could prove more powerful than the CMB at reconstructing the 
intervening matter distribution. 

One interesting (though difficult) application is to "delense" CMB polariza- 
tion maps [302]. CMB lensing creates both E- and B-mode (i.e., curl-free and 
divergence-free) polarization. The main interest in observing the much weaker 
B-mode polarization is that its primordial component is generated by gravi- 
tational waves and offers a sensitive probe of inflation [303,304]. However, the 
lensing contribution must first be identified and removed; lensing aliases power 
from E modes to the much weaker B modes. This can be done statistically with 
high-resolution, all-sky CMB maps, but 21 cm maps provide an alternative 
by independently measuring the projected mass distribution and hence the 
B-mode contamination. In that case, the limiting systematic is the mismatch 
in source redshifts (and hence lensing potential) between the 21 cm screens 
and the CMB. An experiment centered on z = 30 that is sensitive to $\ell < 5000$
could, together with a modest ground-based CMB polarization experiment, 
reach comparable limits to a cosmic-variance limited, all-sky CMB-only mea- 
surement after a year of (on-source) integration [302]. 

Another type of secondary is generated by Thomson scattering in the post- 
reionization IGM, which washes out some of the intrinsic fluctuations (just 
as with the CMB) [148]. But scattering also drives polarization anisotropies 
sourced by the (local) quadrupole fluctuations seen by the scatterers (again 
just as with the CMB [305,306]). However, the local quadrupole of the 21 
cm background may be much larger than that of the CMB, because fluctuations
in the ionized fraction (i.e., HII regions) when $\bar{x}_i \sim 0.5$ have large sizes
(> 20 Mpc; see §8 below) and so contribute a substantial anisotropy. A simple
model with randomly distributed bubbles of a single size and instantaneous
reionization predicts peak rms fluctuations in the polarized brightness
temperature of ~ 3-20 $\mu$K (with the amplitude increasing as reionization moves
to higher redshifts) on scales $\ell \sim 100$-$500$ (depending on the typical bubble
size) [148]. The cross-correlation with temperature is typically an order of
magnitude larger and (neglecting systematics) may be detectable with SKA-class
instruments. 
A third type of secondary is the Sunyaev-Zel'dovich (SZ) effect familiar from
the CMB [307], which describes the modification to the background spectrum 
from energy exchange between photons and electrons during Compton scatter- 
ing. For lines of sight through hot galaxy clusters at low redshifts, this modifies 
the sharp spectral features of the 21 cm background and introduces new, non- 
trivial features [276]. The difference between the spectra seen through the 
cluster and through its surroundings could allow one to measure the global 
21 cm background through a differential measurement free from many of the 
usual systematics (see §3.6). 



5 The Dark Ages 

As we will see in §11, a number of other techniques can constrain star forma- 
tion at z > 6 - and indeed many have already had some successes. However, 
thus far 21 cm observations are the only proposed means of probing the truly 
dark ages between recombination (z ~ 1000) and first light (z ~ 30). 21 By 
observing the era before the messy baryonic physics of galaxy formation ap- 
pears, this is potentially a gold mine of information about cosmological initial 
conditions [2]. It even has two major advantages over the CMB: (i) IGM fluctu- 
ations persist to much smaller mass scales, being unaffected by Silk damping, 
and (ii) cosmic variance is much smaller, because there are many more in- 
dependent modes in the full three-dimensional volume than in the (single) 
last-scattering surface. However, the enormous difficulty of observations at 
such low frequencies means that they probably lie many years in the future. 
The ideas described in this section should therefore be regarded only as hopes 
for the far future; all of the observatories now being planned will focus on the 
"low-redshift" regime z < 15. 

As discussed in §3, the IGM thermally decouples from the CMB at z ~ 200,
cooling adiabatically so that $T_K < T_\gamma$. However, until z ~ 30 it remains
sufficiently dense that collisions drive $T_S \to T_K$. Since $T_S < T_\gamma$, the IGM
can be seen in 21 cm absorption against the CMB, with a peak signal at
z ~ 80. Figure 6 shows the global spin temperature evolution. Because it
depends only on simple, well-known physics (adiabatic cooling and Compton
heating), this global temperature history is exact; any observed deviations will
be an exciting signature of energy injection at these high redshifts (e.g., due to
decaying particles [162,265]). Unfortunately, the observational difficulties are
immense: they are identical to those involved in detecting the "all-sky" $\delta\bar{T}_b(z)$
during the reionization epoch (see §3.6), but at a much lower frequency when
foregrounds are ~ 1000 times stronger ($T_{\rm sky} \sim 10^5$ K at 15 MHz), requiring
telescope calibration stable to one part in $10^7$ over a wide frequency range.
However, the fluctuating 21 cm signal - seeded by linear density fluctuations
- might still be observable, providing a probe of the primordial matter power
spectrum over a wide range of scales [2].

21 It has been proposed that resonant scattering by neutral Li at z ~ 500 might
create detectable CMB anisotropies which could also probe this era [308,309].
Unfortunately, recent calculations show that the Ly$\alpha$ background from residual hydrogen
recombinations is sufficient to keep Li ionized until low redshift [310].

It is useful to briefly review the evolution of these density fluctuations, which 
serve as initial conditions for cosmological simulations and are calculated to 
high accuracy in Boltzmann code solvers [311,312]. Before recombination, 
perturbations in the baryons on sub-horizon scales are strongly suppressed by 
radiation pressure. After recombination, the coupling between baryons and 
photons decreases sharply, and the baryon temperature and density are deter- 
mined by the gravitational attraction of dark matter potential wells, in addi- 
tion to thermalization with the CMB through Compton scattering off residual 
free electrons. The linearized form of the thermal evolution equations (62) and 
(63) is 

$$\frac{d\delta_T}{dt} = \frac{2}{3} \frac{d\delta}{dt} - \frac{x_i}{1 + f_{\rm He} + x_i}\, \frac{T_\gamma}{T_K}\, \frac{\delta_T}{t_\gamma}. \tag{111}$$



Temperature perturbations are thus sourced by scale-dependent density per- 
turbations; this leads to a spatially variable sound speed that modifies the 
growth of baryonic density fluctuations by up to 30% at z = 100 and 10% at 
z = 20 [280,313]. Moreover, it requires that density and temperature pertur- 
bations be tracked separately. The baryonic density perturbations gradually 
approach those of the dark matter, while the temperature perturbations ap- 
proach those of an adiabatic gas at a somewhat later time (see below). The 
acoustic oscillations created as baryons fall into dark matter potential wells, 
along with the variable sound speed, imprint five distinct fluctuation modes 
on the baryons, each with an amplitude sensitive to the underlying cosmology. 
Because each mode evolves differently with redshift, four of these modes may 
be separable with 21 cm data (the fifth is already much smaller) [313]. By 
contrast, low-redshift galaxy surveys are only sensitive to the growing mode 
(which we have written as D(z) throughout). 

Although a complete calculation of $P_{21}(k)$ requires tracking the density and
temperature modes independently, we can build some intuition by ignoring the
extra scale dependence introduced by them. Thus we write $\delta_T = g(z)\delta$ [281].
(This is a reasonable approximation on the scales most accessible to
observations, especially at z < 100 when the growing mode dominates.)
Equation (111) becomes

$$\frac{dg}{dz} \approx \left(\frac{2}{3} - g\right)\frac{d\ln\delta}{dz} + \frac{x_i/t_\gamma}{1 + f_{\rm He} + x_i}\,\frac{T_\gamma}{T_K}\,\frac{g}{(1+z)H(z)}. \qquad (112)$$

Here the first term describes adiabatic heating, which tries to drive $\delta_T \to 2\delta/3$;
the second term (from Compton heating) counters this by trying to make the
gas isothermal. Obviously, during this epoch the terms $\propto \delta_x, \delta_\alpha$ vanish in
equation (92). Thus we can write

$$\delta_{21} \approx (\beta' + \mu^2)\,\delta \qquad (113)$$

where $\beta' = \beta + g(z)\beta_T$ and we have used $\delta_{\partial v} = -\mu^2\delta$. The crucial point is
that, even in this simple approximation, fluctuations in the spin temperature
(and hence $\delta T_b$) do not evolve identically to density fluctuations, because of
the extra redshift-dependent factor $g(z)$. This term describes how the modified
kinetic temperature in dense gas affects the collision rate (and hence the spin
temperature).

Generically, we expect $g(z)$ to evolve from $g = 0$ to $g \sim 2/3$ as the CMB energy
density drops and Compton heating becomes less important. This must compete
with the evolving density perturbations and background temperature
to determine the overall growth of 21 cm fluctuations. We show the resulting
spherically-averaged rms 21 cm brightness temperature in Figure 10 (see
also [2,281]).^22 For this illustrative figure, we have included only the growing
density mode; we refer the reader to [280,313] for more detailed calculations.
We present the amplitude at a fixed wavenumber $k = 0.1$ Mpc$^{-1}$ for
convenience; this is chosen to be in the regime most accessible to observations. The
solid curve shows the net fluctuation amplitude; the other curves show the
various components that contribute to this signal.
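
The expected evolution of $g(z)$ from 0 toward 2/3 can be illustrated with a short numerical sketch. This is not the calculation behind Figure 10. It integrates equation (112) alongside a mean thermal history under several assumptions that are not taken from the text: a constant residual ionized fraction (`X_I`), an illustrative cosmology (`H0`, `OMEGA_M`, `OMEGA_R`), pure growing-mode growth with $d\ln\delta/dz = -1/(1+z)$, a Compton timescale $t_\gamma = 3 m_e c/(8\sigma_T u_\gamma)$, and a mean temperature equation of the assumed form $dT_K/dt = -2HT_K + [x_i/(1+f_{\rm He}+x_i)](T_\gamma - T_K)/t_\gamma$. All function and parameter names are hypothetical.

```python
import math

# Physical constants (CGS units)
M_E = 9.1094e-28      # electron mass [g]
C_LIGHT = 2.9979e10   # speed of light [cm/s]
SIGMA_T = 6.6525e-25  # Thomson cross section [cm^2]
A_RAD = 7.5657e-15    # radiation constant [erg cm^-3 K^-4]
MPC_CM = 3.0857e24    # one megaparsec [cm]

# Illustrative parameters (assumptions, not values from the text)
H0 = 70.0e5 / MPC_CM  # Hubble constant, 70 km/s/Mpc, in s^-1
OMEGA_M = 0.3         # matter density parameter
OMEGA_R = 8.5e-5      # radiation density parameter
T_CMB0 = 2.725        # present-day CMB temperature [K]
X_I = 2.0e-4          # residual ionized fraction, held constant
F_HE = 0.08           # helium fraction by number


def hubble(z):
    """Hubble rate [s^-1] at high redshift (matter plus radiation only)."""
    return H0 * math.sqrt(OMEGA_M * (1 + z) ** 3 + OMEGA_R * (1 + z) ** 4)


def t_gamma(z):
    """Compton timescale t_gamma = 3 m_e c / (8 sigma_T u_gamma) [s]."""
    u_gamma = A_RAD * (T_CMB0 * (1 + z)) ** 4
    return 3 * M_E * C_LIGHT / (8 * SIGMA_T * u_gamma)


def rhs(z, t_k, g):
    """Return (dT_K/dz, dg/dz); the second entry is equation (112)."""
    t_cmb = T_CMB0 * (1 + z)
    # Compton coupling rate per unit redshift
    compton = X_I / (1 + F_HE + X_I) / (t_gamma(z) * (1 + z) * hubble(z))
    dtk_dz = 2 * t_k / (1 + z) - compton * (t_cmb - t_k)
    dlndelta_dz = -1.0 / (1 + z)  # growing mode, delta proportional to 1/(1+z)
    dg_dz = (2.0 / 3.0 - g) * dlndelta_dz + compton * (t_cmb / t_k) * g
    return dtk_dz, dg_dz


def integrate_g(z_start=300.0, z_end=20.0, n_steps=2800):
    """Fourth-order Runge-Kutta integration from z_start down to z_end."""
    dz = (z_end - z_start) / n_steps
    # Start with the gas tightly coupled to the CMB: T_K = T_gamma, g = 0
    z, t_k, g = z_start, T_CMB0 * (1 + z_start), 0.0
    history = [(z, t_k, g)]
    for _ in range(n_steps):
        k1 = rhs(z, t_k, g)
        k2 = rhs(z + dz / 2, t_k + dz / 2 * k1[0], g + dz / 2 * k1[1])
        k3 = rhs(z + dz / 2, t_k + dz / 2 * k2[0], g + dz / 2 * k2[1])
        k4 = rhs(z + dz, t_k + dz * k3[0], g + dz * k3[1])
        t_k += dz / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        g += dz / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        z += dz
        history.append((z, t_k, g))
    return history


if __name__ == "__main__":
    for z, t_k, g in integrate_g()[::400]:
        print(f"z = {z:6.1f}   T_K = {t_k:7.2f} K   g = {g:.3f}")
```

In this toy model $g$ stays between 0 and 2/3 at all times, because the adiabatic term pulls it toward 2/3 while the Compton term pulls it toward 0. It lags the adiabatic limit at z ~ 20 because the relaxation rate is only of order the Hubble rate.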

First, the dotted curve sets $\beta' = 1$; this is equivalent to ignoring fluctuations
in both the temperature and collision rate. Thus in this case the rms
fluctuations simply trace $\bar{\delta T}_b(z)$ multiplied by the growth factor. The long-dashed
curve sets $\beta_T = 0$, ignoring the temperature fluctuations entirely. This
always amplifies the 21 cm fluctuations because, for example, increasing the
density increases the collision rate and hence drives $T_S$ closer to $T_K$. But it
overestimates the real fluctuations, because adiabatic compression in dense
gas increases $T_K$ as well. The short-dashed curve includes this effect as well
but ignores the temperature dependence of collisional coupling (i.e., it sets
$\beta_T = T_\gamma/[T_K - T_\gamma]$). Compared to the dotted curve, this decreases the
fluctuation amplitude at z > 80 but increases it at lower redshifts. At high redshifts,
collisions efficiently couple $T_S$ to $T_K$ everywhere; thus because $T_K$ increases in
overdense regions, so does $T_S$. At lower redshifts, on the other hand, collisions
become inefficient and $T_S \to T_\gamma$. Overdense regions can thus more efficiently
drive $T_S \to T_K < T_\gamma$, causing the spin temperature to fall inside density
enhancements and increasing the overall fluctuation amplitude. Finally, the last
step (included in the solid curve) is to add the temperature dependence of the
collision rates. This is unimportant at high redshifts but further boosts the
fluctuations at lower redshifts, because the collisional rate coefficient is so
steep at low temperatures (see Fig. 2). The true peak of the fluctuations occurs
at z ~ 55.

22 Note that the relative importance of temperature variations depends on angle
because they must compete with the (temperature-independent) velocity term.

Fig. 10. Spherically-averaged fluctuations of the 21 cm brightness temperature
relative to the CMB at $k = 0.1$ Mpc$^{-1}$ as a function of redshift. The solid curve
shows the true net fluctuation amplitude. The dotted curve ignores collisions and
temperature fluctuations entirely (i.e., $\beta' = 1$). The short-dashed curve ignores
the temperature dependence of the collision rate, while the long-dashed curve sets
$\beta_T = 0$.

One more subtlety relevant to this regime is that, at z < 100, collisional
coupling is only marginally efficient. In this case, the hyperfine level populations
depend on the atomic velocity (essentially because faster atoms collide more
frequently), which modifies the signal by a few percent [111].

A great virtue of these observations is that they probe the power spectrum
all the way to $k < 10^3$ Mpc$^{-1}$ (above which Jeans smoothing effects become
important). They can therefore directly constrain modifications of small-scale
power in the standard ΛCDM model made to account for galactic-scale
observations, as well as the possible running of the spectral index suggested by
WMAP [20]. By contrast, since Silk damping suppresses small-scale power
in the CMB, that field can only probe $k < 0.2$ Mpc$^{-1}$. These small scales
are likewise inaccessible to Lyα forest observations, because the Jeans mass
increases by a large factor after reionization. Thus the 21 cm line is a truly
unique cosmological probe. In addition to Jeans smoothing, the 21 cm
absorption line is also subject to thermal smoothing along the line of sight. In
principle, this could be separated out via its angular dependence, allowing a
direct measure of the temperature $T_K(z)$ and Hubble parameter $H(z)$ [280].
In practice, this requires extremely high frequency resolution (e.g., ~ 25 Hz at
z = 50, compared to the redshifted 21 cm frequency of 28 MHz).

Another great virtue of 21 cm observations is the huge number of independent
samples contained in the fully three-dimensional volume [2]. The observable
Universe has a volume ~ 3000 Gpc$^3$ between z = 30 and z = 100, corresponding
to ~ $6 \times 10^{11}$ patches with radii equal to 1 Mpc or > $10^{18}$ independent
Jeans masses. This huge number of samples could potentially be a sensitive
probe of primordial non-Gaussianity, although of course instrumental
systematics will be the real limiting factor.
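
The patch count quoted above is a simple volume ratio, which the following sketch reproduces to order of magnitude. It takes the ~ 3000 Gpc$^3$ volume from the text as an input rather than deriving it from a cosmology, and the function name `n_patches` is hypothetical.

```python
import math

GPC3_IN_MPC3 = 1.0e9  # 1 Gpc^3 = (10^3 Mpc)^3


def n_patches(volume_gpc3, radius_mpc):
    """Number of spheres of the given radius that fit in the given volume."""
    volume_mpc3 = volume_gpc3 * GPC3_IN_MPC3
    patch_volume_mpc3 = 4.0 / 3.0 * math.pi * radius_mpc ** 3
    return volume_mpc3 / patch_volume_mpc3


if __name__ == "__main__":
    print(f"{n_patches(3000.0, 1.0):.1e} patches of radius 1 Mpc")
```

With these inputs the ratio comes out near $7 \times 10^{11}$, consistent with the quoted ~ $6 \times 10^{11}$ given how roughly the volume is specified.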

The main damper on all these exciting possibilities is the extraordinary
difficulty of the observations. The strong foregrounds at these low frequencies
have brightness temperatures two orders of magnitude larger than those of
z ~ 10 observations, which as we will see pose a rather difficult challenge
themselves. Furthermore, the Earth's ionosphere becomes opaque at $\nu < 20$ MHz
(or z > 70), and one must go to space. "Dark age" experiments lie many years
in the future.
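
The frequencies quoted in this section follow from redshifting the 21 cm rest frequency, $\nu_{\rm obs} = \nu_{21}/(1+z)$. The sketch below checks the 28 MHz figure at z = 50, the z ~ 70 ionospheric cutoff at 20 MHz, and the fractional resolution implied by ~ 25 Hz. The rest frequency of 1420.40575 MHz is a standard value supplied here, and the function names are hypothetical.

```python
NU_21CM_MHZ = 1420.40575  # rest-frame hyperfine frequency [MHz]


def observed_frequency_mhz(z):
    """Observed frequency of the 21 cm line emitted at redshift z."""
    return NU_21CM_MHZ / (1.0 + z)


def redshift_from_frequency(nu_mhz):
    """Redshift at which the 21 cm line is observed at nu_mhz."""
    return NU_21CM_MHZ / nu_mhz - 1.0


if __name__ == "__main__":
    nu50 = observed_frequency_mhz(50.0)
    print(f"z = 50 is observed at {nu50:.2f} MHz")
    print(f"20 MHz corresponds to z = {redshift_from_frequency(20.0):.1f}")
    # Fractional resolution needed for 25 Hz channels at z = 50
    print(f"25 Hz / nu_obs = {25.0 / (nu50 * 1.0e6):.1e}")
```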



6 The First Structures 

By z ~ 30, collisions had become so rare that $T_S \sim T_\gamma$, and the 21 cm
fluctuations are no longer visible. However, at just about this time the first 
collapsed objects began to form. These, and the surrounding networks of sheets 
and filaments, produced hot, overdense gas where collisional coupling became 
efficient again. Thus the next phase in the 21 cm history is the birth of the first 
nonlinear structures; in this section, we will describe potential observations of 
this epoch. 

6.1 Minihalos

As we saw in §3.1, the IGM temperature is extremely small in the absence
of luminous sources. Thus the Jeans mass is quite small (~ $10^5 M_\odot$ [314]),
and tiny objects can collapse. However, atomic line cooling in primordial gas
requires virial temperatures $T_{\rm vir} > 10^4$ K (or $M > 10^8 M_\odot$). Thus over a
relatively wide range in mass, halos are probably unable to cool and form stars
through the usual channels. Such objects are known as "minihalos," and they
can evolve along two possible routes. First, if H$_2$ is able to form, the gas can
cool through its vibrational transitions. It is via this process that the first
minihalos are believed to cool (over relatively long time intervals) and probably
produce the first stars [315-317].

However, once these first stars form, subsequent generations of minihalos no
longer form in isolation. In particular, these stars flood the Universe with UV
photons. As well as initiating Wouthuysen-Field coupling, photons in the
energy range 11.18-13.6 eV (the so-called Lyman-Werner band) dissociate H$_2$
and thus suppress cooling inside minihalos (see [46] for a review). As a
result, minihalos virialize but do not fragment and form stars. Instead they
become dense ($\delta_{\rm mh} \sim 200$) gas clouds that float serenely through the IGM.
During reionization, minihalos become photon sinks and contribute to the
gas clumpiness; thus their properties are important to measure. Fortunately,
they also become 21 cm sources [318]. Given their virial temperatures, $\delta_{\rm mh}$
is much larger than the critical value of equation (67), so collisional coupling
becomes extremely efficient. Unfortunately, the typical minihalo is < 10 kpc
across, much smaller than the resolution limit (at any reasonable sensitivity)
of any telescope for the foreseeable future. Thus minihalos must be detected
statistically [318,319]. The net signal depends on the fraction of baryons $f_{\rm mh}$
incorporated into minihalos and their average bias $b_{\rm mh}$.

Figure 11 shows some example power spectra built from the halo model
[102,320]. The dot-dashed curves show the rms brightness temperature fluctuations
for a universe in which the entire IGM has $T_S \gg T_\gamma$; the thick and thin curves
are for z = 20 and z = 10, respectively, assuming linear theory. The uppermost
solid and dashed curves show the rms fluctuations from minihalos at these two
redshifts. We see that the minihalo signal is ~ 1 mK at z = 10, and nearly
an order of magnitude smaller at z = 20. On large scales, the net minihalo
fluctuations are [320]

$$\bar{\delta T}_b\,\Delta_{\rm mh}(k) \approx f_{\rm mh}\, b_{\rm mh}\, \bar{\delta T}_b^{\rm hot}\, \Delta_{\rm lin}(k), \qquad (114)$$

where $\bar{\delta T}_b^{\rm hot}$ is the mean brightness temperature if $T_S \gg T_\gamma$ and $\Delta_{\rm lin}$ is the
linear density power spectrum. This shape only changes on scales $k > 10$ Mpc$^{-1}$,
where individual minihalos are resolved and the power spectrum instead traces
their density profiles. Note, however, that the halo model does not include
nonlinear clustering, which modifies $P(k)$ on somewhat larger scales (see
below) [319].

Of course, all the elements of equation (114) depend on the underlying cos- 
mological model, and in principle minihalos provide a probe of its parameters 
(especially the small scale matter power spectrum) [318]. However, as is all 
too common in astronomy, there are unavoidable - and large - degeneracies 
with the astrophysics. Most importantly, as we have seen in §3.2.1, luminous 
sources produce X-rays, which have two effects on minihalos. First, they
penetrate minihalos and increase the free electron fraction, catalyzing H$_2$
formation. This promotes cooling and could allow minihalos to fragment and form
stars [167,174,255]. Unfortunately, the net effects of UV and X-ray feedback
remain unsettled.

Fig. 11. Expected rms brightness temperature fluctuations from minihalos. Thick
solid and thin dashed curves are for z = 10 and 20, respectively. From top to bottom
within each set, the curves assume $T_K = T_{\rm ad}$, 20, 100, and 1000 K with $J_\alpha = 0$. The
dot-dashed curves show the fluctuations from linear theory, assuming that the entire
IGM has $T_S \gg T_\gamma$. From [320].

But a more important aspect may be X-ray heating of the IGM, which
suppresses minihalo formation by increasing the Jeans mass and preventing smaller
minihalos from accreting gas [321]. The effects are most easily expressed
through the "entropy" $K_{\rm IGM} = T_K/n^{2/3}$ injected into the IGM. The entropy is
conserved during adiabatic expansion and contraction, so the effects of
feedback can be seen by comparing $K_{\rm IGM}$ to the entropy $K_{\rm mh} = T_{\rm vir}/n(r_{\rm vir})^{2/3}$
generated during halo formation. If $K_{\rm IGM} \gg K_{\rm mh}$, the thermal pressure
generated during collapse exceeds the gravitational potential, and that minihalo
is unable to accrete gas [321]. The lower sets of curves in Figure 11 show
the minihalo signal if $T_K = 20$, 100, and 1000 K, assuming that the energy
was injected at the cosmic mean density [320]. X-rays can decrease the signal
by more than two orders of magnitude - much larger than any cosmological
uncertainty. If the minihalo fluctuations can be seen, they will constrain
astrophysical feedback rather than cosmological parameters.
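
The entropy comparison above can be made concrete with a small sketch. It evaluates $K = T/n^{2/3}$ for gas heated at the cosmic mean density and for gas at the virial radius of a minihalo. The present-day baryon density `N_B0`, the overdensity at the virial radius `delta_vir`, and the example virial temperature of 2000 K are illustrative assumptions, not values from the text. The function names are hypothetical.

```python
import math

N_B0 = 2.5e-7  # assumed mean baryon number density today [cm^-3]


def mean_density(z):
    """Mean baryon number density at redshift z [cm^-3]."""
    return N_B0 * (1.0 + z) ** 3


def entropy(temperature_k, density_cm3):
    """Entropy parameter K = T / n^(2/3) [K cm^2]."""
    return temperature_k / density_cm3 ** (2.0 / 3.0)


def entropy_ratio(t_k, t_vir, z, delta_vir=60.0):
    """K_IGM / K_mh for IGM gas heated to t_k at the mean density.

    delta_vir is the assumed gas density at the virial radius in units
    of the cosmic mean (a hypothetical value, for illustration only).
    """
    n_mean = mean_density(z)
    k_igm = entropy(t_k, n_mean)
    k_mh = entropy(t_vir, delta_vir * n_mean)
    return k_igm / k_mh


if __name__ == "__main__":
    for t_k in (20.0, 100.0, 1000.0):
        ratio = entropy_ratio(t_k, t_vir=2000.0, z=10.0)
        print(f"T_K = {t_k:6.0f} K  ->  K_IGM / K_mh = {ratio:.2f}")
```

Because both entropies scale as $n^{-2/3}$, the ratio reduces to $(T_K/T_{\rm vir})\,\delta_{\rm vir}^{2/3}$ and does not depend on redshift in this simplified form. Under these assumptions the hottest case exceeds unity, the regime in which the text says accretion is suppressed.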

Unfortunately, separating the minihalo signal from the IGM will be difficult,
because the Wouthuysen-Field effect renders the entire IGM visible against
the CMB at an early stage of structure formation. Because the IGM contains
nearly all the mass, it will outshine the minihalo component by a large
factor [274]. Figure 12 illustrates how the minihalo signal evolves in our fiducial
Pop II model of §3.5, except that we have set $f_X = 0.2$ to minimize the
importance of X-ray heating. Panel (a) shows $\bar{x}_i$ together with the mass fraction
in minihalos. Note that (because the effects of X-ray heating are
controversial) we show two cases for $f_{\rm mh}$, one ignoring X-ray heating and one assuming
suppression through the entropy floor [321].

Fig. 12. Minihalo signal in a cosmological context. (a): The mean ionized fraction $\bar{x}_i$,
the collapsed fraction in star-forming halos $f_{\rm halo}$, the minihalo fraction $f_{\rm mh}$, and the
bias-weighted minihalo fraction. For $f_{\rm mh}$, the upper and lower curves respectively
neglect and include X-ray heating. (b): Temperature history. (c): Large-scale 21
cm fluctuation amplitude. (d): Mean brightness temperature relative to the CMB.
From [320].

Figure 12c shows the large-scale fluctuation amplitude for the IGM, minihalo,
and total signals. Minihalos are the only objects to shine brightly at z > 22,
before Lyα coupling begins. Unfortunately, at such high redshifts the minihalo
component contains only a negligibly small fraction of the mass, so it is still
overwhelmed by the weak IGM fluctuations. At z > 14, the IGM is still cold
(but visible), so the entropy floor is relatively unimportant. In this regime,
(hot) minihalos and the (cold) IGM actually cancel each other out in terms
of the mean signal $\bar{\delta T}_b$. Thus the minihalos do have observable effects - but
because they trace the linear matter power spectrum on large scales (just like
the IGM), they remain difficult to separate. In particular, the subtle change in
$\bar{\delta T}_b$ they cause can easily be mimicked by a slightly different thermal history.
By z < 15, the minihalo phase does contain a substantial fraction of the mass,
but the IGM has become hot. In this regime, minihalos have identical emission
properties to the diffuse IGM and - even in principle - cannot be distinguished
from the IGM short of resolving them. Note as well that this conclusion is
independent of the efficacy of X-ray feedback.

Thus our best hope of observing minihalos is to resolve them, or at least their 
nonlinear clustering patterns [319]. The dynamics of small-scale clustering 
modify the shape of the power spectrum at $k > 3$ Mpc$^{-1}$ and so make it
distinguishable from the linear power spectrum of the IGM. This effect is not 
included in the halo model. Unfortunately, these scales are beyond the reach 
of any planned experiment, but if they can be measured they will provide 
valuable insight into the "sinks" of reionization. A better way to probe such 
small physical scales may be with the "21 cm forest" (see §10). 



6.2 The IGM 



Perhaps the best hope to observe the first non-linear structures as they form 
is before Wouthuysen-Field coupling becomes efficient. But, of course, even in 
that case minihalos must compete with any other structures visible against 
the CMB. In particular, equation (67) shows that, even at z ~ 20, collisional 
coupling does not necessarily require virial overdensities if the gas is moder- 
ately warm. Thus we would expect to see the cosmic web visible against the 
CMB even during the earliest phases of structure formation. Although this is 
a difficult regime to study, it has been considered both analytically [183] and 
in numerical simulations [185,186]. 

The analytic model of cosmic web shocks described in §3.2.3 provides some
intuition for the IGM signal. It assumed that cosmic web shocks form at
linearized overdensities characteristic of "turnaround" from the IGM Hubble flow
($\delta_{\rm ta} = 1.06$). The crucial point is that this density threshold is well below
virialization ($\delta_c = 1.69$), so filaments form before halos. The cosmic web contains
a fraction ~ (0.1%, 3%, 25%) of the gas at z = 30, 20, and 10 (compare to
$f_{\rm mh}$ in Fig. 12b) so could boost the emission significantly. Unfortunately, the
emission strength also depends on the structure of the shocked gas, which
the analytic model cannot accurately describe. It is therefore best to turn to
numerical simulations.

High-resolution simulations can compute the detailed temperature and density
distributions of IGM gas (including shocks). The left-hand panels of Figure 13
(taken from [186]) illustrate the resulting signals. As the cosmic web forms,
gas collapses onto sheets and filaments. Because the IGM has $T_K = 7.2$ K <
$T_\gamma = 50.5$ K at z = 17.5, the initial contraction leaves the gas overdense but
still cool. Collisional coupling is somewhat more efficient in this gas, and the
filaments are characterized by weak absorption. As it continues to collapse, the
gas shocks, heating it well above $T_\gamma$. In these central regions, collisions are quite
efficient, so filaments emit relative to the CMB. However, the total mass locked
up in this phase is still small: only ~ 1.7% of the box has $\delta T_b < -10$ mK, with
a comparable amount observable in emission. This kind of signal therefore lies
well beyond the reach of any instruments on the horizon today.

Fig. 13. Projected mass-weighted spin temperature (upper panels) and $\delta T_b$ (lower
panels) from a 0.5 Mpc simulation box at z = 17.5. The left panels assume no
radiation background, while the right panels contain a miniquasar (located at the
cross) that emits X-rays (but does not induce Wouthuysen-Field coupling). Note
that $T_\gamma = 50.5$ K at this redshift. From [186].

However, as the analytic model shows, the fraction of gas in the cosmic web 
increases rapidly at lower redshifts, because all of these structures are far off on 
the exponential tail of the mass function. This is seen in much more detail in 
simulations that continue to lower redshifts [185] . These also show a fluctuating 
IGM appearing in both emission and absorption. As the mean cosmic density 
decreases, the absorption phase weakens because overdensities characteristic of 
shocks are required for collisional coupling. These simulations also compared 
the IGM emission to that of minihalos [185]. By z ~ 9, the nonlinear mass 
scale is sufficiently large that $f_{\rm mh} > 10\%$ (see Fig. 12b). At this point, the
IGM emission is about half that of minihalos; at higher redshifts the IGM is 
more important (although its absorbing and emitting components do tend to 
cancel each other out). 

Of course, once a Lyα background (or even a strong X-ray background) is
in place, all of the IGM lights up in the 21 cm line. Because this occurs
quite early in the most plausible models, the collisionally coupled phase will
probably always be difficult to observe. The IGM will boost the emission -
especially at higher redshifts, when both filaments and minihalos are far above
the nonlinear mass scale [183] - but not by a large enough factor to render it
easily observable.



7 The First Luminous Objects 

Once the first sources of light appear, the character of the 21 cm sky changes 
completely. These objects change the spin temperature (through Wouthuysen- 
Field coupling), the kinetic temperature (through X-rays), and of course the 
ionized fraction. We studied the mean evolution of these quantities in §3, but 
of course all of them also introduce fluctuations in the 21 cm signal from which 
we can learn about the first sources. We have already seen that heating and 
Wouthuysen-Field coupling typically precede significant ionization. To begin, 
we will therefore ignore HII regions and focus on fluctuations induced by the 
Lyα background and X-rays.

7.1 Fluctuations from the Wouthuysen-Field Effect 

If star-forming galaxies dominate the UV radiation field, the average Lyα
background is given by equation (80): it is simply the sum of photon fluxes
redshifting into each Lyn transition, weighted by the appropriate $f_{\rm rec}(n)$. At
first sight, it might seem that this background would be extremely uniform
because the volume sampled by photons that redshift directly into the Lyα
transition has a radius ~ 250 Mpc. However, in reality several factors combine
to render the fluctuations significant [192]: (i) the flux is weighted by $r^{-2}$
and is hence more sensitive to nearby sources; (ii) the higher Lyman series
transitions are more closely spaced in frequency and so have much smaller
effective horizons; (iii) the first sources of light are highly clustered; and (iv)
the finite speed of light implies that more distant sources are sampled earlier
in their evolution (when they were presumably less luminous). As a result,
the UV background is bright near clumps of high-redshift sources and faint
elsewhere. Observations of this patchwork of emission or absorption would
measure when these sources first appeared, their clustering, and their spectra.

The resulting fluctuation power spectrum has been calculated in the limiting
case of $\bar{x}_i = 0$ and uniform (or zero) X-ray heating [130,192] and in a more
general case including non-uniform X-ray heating [129]. We will first focus on the
former. The fluctuations have two parts. The first traces the underlying density
field, in a similar manner to the ionization model we will describe below:
large-scale overdensities contain extra sources and hence have large $x_\alpha$. The resulting
power spectra have the form $[P_{\alpha\alpha}(k), P_{\delta\alpha}(k)] = [W^2(k), W(k)]\,P_{\delta\delta}(k)$. Here
$P_{\alpha\alpha}$ is the power spectrum of fluctuations in $x_\alpha$, $P_{\delta\alpha}$ is its cross-correlation
with the density field, and $W(k)$ is a scale-dependent weighting factor. On
large scales, $W(k)$ approaches the average linear bias of the sources, as should
be expected for a background produced by a population of discrete halos [192].
On scales much smaller than the effective horizon of Lyn photons, the power
vanishes since the radiation field is smooth beneath these scales. On
intermediate scales, the horizons of the various Lyn transitions modulate the effective
bias [130]. The stochastic distribution of galaxies provides a second set of
fluctuations [322,323]: two nearby points sample nearly the same distribution of
sources, inducing strong intrinsic correlations in the flux field. As with any
Poisson process, the fractional fluctuations are proportional to $N^{-1/2}$, where
$N$ is the effective number of sources inside the sampled volume. Thus this
term is important on scales comparable to the mean separation of sources.
Because these two components relate to the density field (which sources the
$\mu$-dependent velocity term) in different ways, they are in principle separable
using redshift space distortions.

Figure 14 shows some example power spectra at z = 20 calculated following
[130]. The solid curves in the upper panels show the $\mu^2$ component of the
brightness temperature power spectrum [192],

$$P_{\mu^2}(k) = 2\left[\beta + g(z)\beta_T + \beta_\alpha W(k)\right] P_{\delta\delta}(k), \qquad (115)$$

which includes both the density fluctuations (dotted curves, shown for $x_\alpha = 1$)
and those induced by Wouthuysen-Field coupling. The lower panels show the
Poisson component (uncorrelated with the density field). The two panels span
the range of plausible source parameters: the right plot assumes that Pop II
stars form in halos with $T_{\rm vir} > 10^4$ K, while the left plot assumes that Pop
III stars form in all halos with $T_{\rm vir} > 500$ K. In both cases, we assume a
constant star formation efficiency in all halos above threshold, normalized so
that $x_\alpha = 0.25$, 1, and 10 (bottom to top).
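
To illustrate how a $\mu^2$ component can be isolated in principle, the sketch below assumes the standard linear-theory angular form $P_{21}(k,\mu) = P_{\mu^0} + \mu^2 P_{\mu^2} + \mu^4 P_{\mu^4}$ at fixed $k$. Only the $\mu^2$ coefficient is written out in equation (115); the other two terms, the three sampling angles, and the numerical amplitudes are assumptions chosen for illustration. The function names are hypothetical.

```python
import math


def power_at_mu(mu, p_mu0, p_mu2, p_mu4):
    """Angular dependence of the power at fixed k (assumed form)."""
    return p_mu0 + mu ** 2 * p_mu2 + mu ** 4 * p_mu4


def separate_components(p_at_0, p_at_half, p_at_1):
    """Recover (P_mu0, P_mu2, P_mu4) from power measured at mu^2 = 0, 1/2, 1."""
    p_mu0 = p_at_0
    s = p_at_1 - p_mu0     # equals P_mu2 + P_mu4
    h = p_at_half - p_mu0  # equals P_mu2 / 2 + P_mu4 / 4
    p_mu2 = 4.0 * h - s
    p_mu4 = s - p_mu2
    return p_mu0, p_mu2, p_mu4


if __name__ == "__main__":
    truth = (3.0, 2.0, 1.0)  # arbitrary amplitudes, for illustration only
    samples = [power_at_mu(mu, *truth) for mu in (0.0, math.sqrt(0.5), 1.0)]
    print(separate_components(*samples))
```

A real measurement would fit many noisy $(k, \mu)$ cells by least squares rather than invert three exact samples, but the linear algebra is the same.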

Clearly the fluctuation amplitude depends on the source properties: the signal 
increases for halos that are rarer and more biased [192]. The boost is much 
larger - over an order of magnitude between the two cases - for the Poisson 
fluctuations, because those depend directly on the source density. The stellar 
spectral shapes matter only on intermediate scales where the distribution of 
photons between the different Lyn transitions has a weak effect on the k- 
dependence. On large scales, the fluctuations are several times greater than the 
density fluctuations (because of the source bias), but the amplification vanishes 
at smaller scales. In general, on observable scales the Poisson fluctuations are 
much weaker than the density-induced component because the corresponding 
volumes are rather large.

Fig. 14. 21 cm brightness temperature fluctuations from Wouthuysen-Field coupling
[129]. Upper panels: The $\mu^2$ component of $P_{21}(k)$. Lower panels: Power sourced by
stochastic variations in the galaxy density. In each panel, we assume z = 20 and
$T_K = T_{\rm ad}$. The left and right plots assume Pop III (Pop II) stars form in halos
with $T_{\rm vir} > 500$ K ($10^4$ K). We show $x_\alpha = 0.25$, 1, and 10 (solid curves, bottom
to top). The dotted curves in the upper panels show the component from density
fluctuations when $x_\alpha = 1$. The vertical dashed lines show the scales corresponding
to the "Lyα horizon" $k_\alpha$ and the minimum HII region size $k_{\rm HII}$.

From the form of $\beta_\alpha$ in equation (95) and in Figure 9, it is obvious that the
fluctuations must peak when $x_\alpha \sim 1$; this is clearly visible here as well. The
saturation when $x_\alpha \gg 1$ implies that fluctuations in the Lyα background
are only relevant over a limited redshift range, so identifying that epoch, and
studying the power spectrum during it, constrain the parameters of the first
stars.

Figure 14 assumes $T_K = T_{\rm ad} = 9.25$ K, the temperature appropriate for an
adiabatically expanding IGM with no heat sources other than Compton
scattering. In this case, $\delta T_b = -100.7$ mK at $x_\alpha = 1$. The fluctuations reach
~ 10 mK on potentially observable scales ($k \sim 0.1$ Mpc$^{-1}$), implying ~ 10%
variations across the sky. Because the rms fluctuations scale with $\bar{\delta T}_b$, these
can be easily converted to other scenarios with uniform heating. Note then
that, if $T_K \gg T_S$, these fluctuations will be significantly harder to observe.

The above calculations assumed linear fluctuations. Of course, this may not be 
a good approximation in all cases, especially if the sources are highly clustered 
[130]. For instance, for Pop III stars with large escape fractions (which have 
large ratios of ionizing photons to Lyn photons), recombinations near the edges 
of the HII regions would produce copious numbers of Lyα photons and could
create strong coupling near the edge of the region. 

To this point we have included only the coupling induced by stellar radiation;
any associated X-ray component will also produce Lyα photons and hence
modify the coupling (see eq. 81). As we will see below, the mean free path of
X-ray photons is typically a strong function of energy, so they seed smaller-scale
fluctuations and shift the observed power to larger k-modes [129]. Imaging the
regions around the first stars would offer even more powerful constraints on
their spectra [127,128,324], although it is far beyond the capabilities of any
planned instruments. These objects would appear as HII regions surrounded
by zones of strong emission. Close to the star, soft X-rays heat the gas, and
Lyα photons (either produced directly by the stars or through X-ray
excitation) couple the spin and kinetic temperatures. Farther out (where heating
is weak but coupling still strong), absorption will dominate until the gas
eventually fades into invisibility. In either case, the Wouthuysen-Field coupling
seeded by X-rays is not a small perturbation and must be included in realistic
calculations; it modifies the shape of the power spectrum on relatively small
scales [128,129]. Moreover, in most circumstances these fluctuations cannot be
cleanly separated from those in $T_K$, and the two must be considered together
(see below).



7.2 X-Ray Pre-ionization 

In order to see the high-redshift IGM in 21 cm emission, a large fraction of the
gas must be heated without becoming highly ionized. An X-ray background fits
the bill perfectly: with their large mean free paths, X-ray photons can pervade
the bulk of the IGM and provide a fairly uniform heating source even far away
from galaxies. On the other hand, once $x_e > 0.1$,^23 secondary ionizations
become unimportant and the bulk of the primary electron's energy increasingly
goes into Coulomb collisions (see eq. 69); an IGM heated by X-rays will
therefore remain predominantly neutral [172]. As previously mentioned in §3.2.1,
possible X-ray sources include supernovae (which produce both free-free and
inverse Compton emission), X-ray binaries, and mini-quasars. X-rays are thus
an inevitable byproduct of any of the luminous sources that can source
reionization. We already discussed their most important aspect, X-ray heating, in
§3.2.1. Here we describe three other aspects relevant to 21 cm observations:
the increased coupling between $T_K$ and $T_S$ in a warm, partially ionized IGM,
fluctuations in the heating rate, and observational constraints on an early
X-ray background.

23 In this section, we will use $x_e$ to denote the mean ionized fraction outside of HII
regions.

The importance of an X-ray background in promoting collisional coupling
should be immediately obvious from Figure 2. First, the rate coefficient
(from H-H collisions alone) rises steeply at low temperatures; thus, even a
small amount of X-ray heating will sharply increase the amount of collisional
coupling. Second, in the range 100 K < $T_K$ < 3000 K, the H-e$^-$ collisional
coupling rate is ~ 20 times larger than the H-H collisional coupling rate.
Thus, as noted by [325], if $x_e > 5\%$, H-e$^-$ collisions could predominate. This
is particularly important because a moderately warm IGM is typically not
too far from $x_c \sim 1$ anyway. For example, if $T_K \sim 500$ K in a purely neutral
medium, an overdensity $\delta > 8[(1 + z)/15]^{-2}$ is required for $x_c > 1$ to unlock
the spin temperature from the CMB; thus, even in such a medium, filaments
will be visible in emission. However, if it is ionized, then $x_c^{eH} \approx 2(x_e/0.1)\,x_c^{HH}$,
and the critical density for coupling decreases to of order the mean density
$\delta \sim 1$.
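
The scalings in this paragraph can be combined into a rough estimate of the overdensity needed for collisional coupling. The sketch below takes the neutral-gas threshold $\delta > 8[(1+z)/15]^{-2}$ (valid for $T_K \sim 500$ K) and the ratio $x_c^{eH}/x_c^{HH} \approx 2(x_e/0.1)$ from the text. It further assumes that the total coupling scales linearly with density, so that electron collisions lower the threshold by the factor $1 + x_c^{eH}/x_c^{HH}$. The function name is hypothetical.

```python
def critical_overdensity(z, x_e=0.0):
    """Approximate overdensity for x_c > 1 at T_K ~ 500 K.

    Assumes coupling proportional to density, so H-e collisions divide
    the neutral-gas threshold by (1 + x_c^eH / x_c^HH).
    """
    neutral_threshold = 8.0 * ((1.0 + z) / 15.0) ** -2
    electron_boost = 1.0 + 2.0 * (x_e / 0.1)
    return neutral_threshold / electron_boost


if __name__ == "__main__":
    for x_e in (0.0, 0.05, 0.1, 0.3):
        print(f"z = 14, x_e = {x_e:.2f}: delta_crit ~ {critical_overdensity(14.0, x_e):.1f}")
```

At $x_e = 0.05$ the threshold halves, which is the point at which the text says H-e$^-$ collisions begin to predominate. For larger ionized fractions it approaches the order-unity values quoted above.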

The right hand panels of Figure 13 illustrate how non-uniform X-rays can 
affect the spin and brightness temperatures [186]. In this simulation, a mini- 
quasar X-ray source has been placed at the "X" near the center of the box. 
The contrast with the left-hand panels, which lack a mini-quasar and in which 
only dense filaments are visible (see the discussion in §6.2), is stark. The 
relatively strong X-ray emission has heated the gas to ~ 2800 K and set 
$x_e$ ~ 0.03 in the box; thus collisional coupling becomes quite strong in even
slightly overdense filaments (especially near the mini-quasar). This leads to a 
dramatic increase in the magnitude and covering fraction of 21 cm emission; 
indeed, because the local density determines the coupling strength (and the 
ionized fraction, provided that the gas is dense enough for photoionization 
equilibrium to apply), even a uniform X-ray background increases the 21 cm 
contrast between filaments and voids [325]. The resulting fluctuations have 
magnitudes of a few mK on arcminute scales. 

Although illustrative, this calculation did not include the Ly$\alpha$ background
generated by X-rays (through collisional excitation; see eq. 81). Such photons
are usually not produced in sufficient quantities to drive $x_\alpha \to 1$, but they
can seed strong enough coupling to substantially increase the $\delta T_b$ fluctuations
[127,128], especially near sources (where most of the soft X-rays are absorbed).

Thus, even in the absence of a UV background, an early X-ray background
can still drive $T_S \to T_K$, and some such scenarios are plausible and even
attractive in some ways [163,326,327]. Interestingly, in the limit of uniform
radiation backgrounds, adding Wouthuysen-Field coupling actually decreases
the contrast of filaments because $x_{\rm tot}$ is much more uniform. On the other
hand, as discussed in §7.1, the UV background varies, which seeds associated
21 cm fluctuations. By permitting $T_S \to T_K$ coupling in regions where the
UV flux is sub-critical, X-ray induced collisional coupling could modulate or
smooth out such fluctuations.

Because of their long mean free paths, X-ray photons are often portrayed
as providing a uniform background. This is of course not strictly true. The
comoving mean free path of an X-ray photon with energy E is:

$$\lambda_X \approx 4.9\, \bar{x}_{\rm HI}^{-1/3} \left(\frac{1+z}{15}\right)^{-2} \left(\frac{E}{300\ \mathrm{eV}}\right)^{3}\ \mathrm{Mpc}; \qquad (116)$$

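thus, the universe will be optically thick to all photons below
$\sim 1.8\,[(1+z)/15]^{1/2}\,\bar{x}_{\rm HI}^{1/3}$ keV. By comparison, a photon produced just redward of Ly$\beta$
can travel a comoving distance

$$\lambda_\alpha \approx 330 \left(\frac{1+z}{15}\right)^{-1/2}\ \mathrm{Mpc} \qquad (117)$$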

before redshifting into the Ly$\alpha$ resonance. Thus, soft X-ray photons will fluctuate
on relatively small scales but, because of the steep energy dependence,
there will be a uniform component to the X-ray background (unlike in the
UV). For a spectrum with $\nu L_\nu \sim$ const (such as the nonthermal component
observed in nearby ultraluminous X-ray sources), the component uniform on
~ 5 Mpc scales provides ~ ln(2000/300)/ln(300/13.6) ~ 60% of the heating
compared to the fluctuating component. Another consequence of $\lambda_X \propto E^3$ is
that the spectrum will harden significantly as one proceeds away from the
clustered ionizing sources. Nonetheless, many of the causes of fluctuations in
the Wouthuysen-Field effect ($1/r^2$ weighting; clustering of sources) apply with
equal force to X-rays.
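
The two numbers in this paragraph are easy to reproduce. The sketch below assumes the mean free path scales as $4.9\,\bar{x}_{\rm HI}^{-1/3}[(1+z)/15]^{-2}(E/300\ \mathrm{eV})^3$ Mpc (the normalization and the 300 eV pivot are taken as assumptions here) and a flat $\nu L_\nu$ spectrum, so that equal energy is emitted per logarithmic interval in photon energy. The function names and the 2 keV upper cutoff argument are illustrative choices.

```python
import math


def xray_mean_free_path_mpc(energy_ev, z, x_hi=1.0):
    """Comoving X-ray mean free path in Mpc for the assumed scaling."""
    return (4.9 * x_hi ** (-1.0 / 3.0)
            * ((1.0 + z) / 15.0) ** -2
            * (energy_ev / 300.0) ** 3)


def uniform_to_fluctuating_heating(e_min_ev=13.6, e_split_ev=300.0, e_max_ev=2000.0):
    """Energy ratio of the two components for a flat nu*L_nu spectrum.

    Photons above e_split_ev travel farther than ~5 Mpc and are counted as
    'uniform'; photons between e_min_ev and e_split_ev are 'fluctuating'.
    """
    return math.log(e_max_ev / e_split_ev) / math.log(e_split_ev / e_min_ev)


print(xray_mean_free_path_mpc(300.0, 14.0))  # scale separating the two components
print(uniform_to_fluctuating_heating())      # roughly 0.6, as quoted in the text
```

Because the mean free path grows as $E^3$, doubling the photon energy increases the scale over which the background is smooth by a factor of eight.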

Fluctuations from inhomogeneous X-ray heating can be computed using a
transfer function $W_X(k)$ that parameterizes the perturbations in the heating
rate relative to the density field; the only substantive difference from the Ly$\alpha$
case is that $T_K$ depends on the entire history of heating while $x_\alpha$ depends only
on the instantaneous flux at the Ly$\alpha$ transition [129]. Fluctuations sourced
by $T_K$ have one unique property: the cross-power between temperature and
density can be negative if $T_K < T_\gamma$ (because then $\beta_T < 0$). Physically, dense
gas (which tends to sit near luminous sources) is warmer than average and
has a smaller $|\delta T_b|$ than average. Detection of such a feature would provide a
clear indication of a cold IGM with substantial temperature fluctuations.
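
The sign change is easiest to see in the strong-coupling limit. Assuming $T_S = T_K$, so that $\delta T_b \propto 1 - T_\gamma/T_K$, the logarithmic response to a temperature perturbation is $T_\gamma/(T_K - T_\gamma)$. The full $\beta_T$ contains additional coupling terms that this sketch omits, and the function names are hypothetical.

```python
def beta_T_strong_coupling(t_k, t_gamma):
    """d ln(1 - T_gamma/T_K) / d ln T_K, assuming T_S = T_K."""
    return t_gamma / (t_k - t_gamma)


def numerical_beta_T(t_k, t_gamma, eps=1e-6):
    """Central finite-difference check of the same logarithmic derivative."""
    def factor(t):
        return 1.0 - t_gamma / t
    return (factor(t_k * (1 + eps)) - factor(t_k * (1 - eps))) / (2 * eps * factor(t_k))


t_gamma = 2.726 * (1 + 15)  # CMB temperature at z = 15, in K
for t_k in (10.0, 30.0, 100.0, 1000.0):
    print(t_k, beta_T_strong_coupling(t_k, t_gamma))
```

The coefficient is negative whenever the gas is colder than the CMB and tends to zero once $T_K \gg T_\gamma$, consistent with temperature fluctuations fading as the IGM heats.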

Figure 15 shows the resulting power spectra (including both the spherically-
averaged component in the top panels and the $\mu^2$ component in the bottom
panels) in a model similar to our fiducial Pop II history from §3.5.2 [129]. At
high redshifts, z > 18, Wouthuysen-Field fluctuations dominate and the power
spectra appear similar to those in Figure 14. X-ray heating begins to kick in at
z ~ 17 and immediately has a dramatic effect. $P_{\mu^2}$, which contains the
density-temperature cross-power, becomes negative on intermediate scales, where
temperature fluctuations are strong. The spherically-averaged power remains
positive (as it must), but it contains two distinctive troughs separating the
regimes where density, temperature, and Ly$\alpha$ fluctuations dominate (from small
to large scales). The peak on the largest scales eventually disappears (once
$x_\alpha \gg 1$), and the troughs disappear entirely once $T_K > T_\gamma$ (at z ~ 14 here).

Fig. 15. Power spectra of $\delta T_b$ including fluctuations in both $T_K$ (from X-ray
heating) and the Wouthuysen-Field coupling. The upper panels show the
spherically-averaged power, while the lower panels show the $\mu^2$ component. We show
several different redshifts in a reionization history similar to our fiducial Pop II
model of Fig. 7. In the lower panels, thick curves denote a positive signal and thin
curves a negative signal. The thin curves in the top panels also show the fluctuations
with uniform heating at z = 19 and 14 (left and right, respectively) for reference.
From [129].

As the mean IGM temperature increases, X-ray fluctuations become less and
less important (eventually disappearing once $T_S \gg T_\gamma$ everywhere). However,
they can still be substantial even after the peak-trough component disappears,
especially when $T_K$ is still relatively close to the CMB temperature: compare
the dotted z = 14 curve to the thin solid curve in the top right panel, which
shows the expected fluctuation amplitude if the heating is completely uniform.
However, at least in this scenario, these fluctuations still precede those
generated by HII regions during reionization and can be separated cleanly.
Measuring the 21 cm power spectrum in this regime would reveal the properties
of the first X-ray sources (and especially their bias and spectra) and may
provide our best hope of constraining the thermal history of the IGM [129].

What observational constraints can we place on high-redshift X-ray emission?
The most obvious limit is the present-day soft X-ray background (SXRB), part
of which could originate from a high-redshift hard X-ray background (> 10
keV) that free streams until today [258,259,326]. Approximately 94% of
the SXRB has been resolved [328]; the mean and maximum intensities of the
unresolved component are $(0.35, 1.23) \times 10^{-12}$ erg s$^{-1}$ cm$^{-2}$ deg$^{-2}$, respectively,
in the 0.5-2 keV band. Unfortunately, constraining X-ray reionization from
the unresolved X-ray background (XRB) requires an uncertain extrapolation
of the spectral energy distribution to hard energies. For instance, in mini-
quasars, most of the UV and soft X-ray photons are thought to come from
an accretion disk, while hard X-rays are part of a power-law tail produced
through synchrotron/inverse-Compton emission. The relative contribution of
the two components is extremely uncertain.

Fortunately, we can perform a simple order-of-magnitude estimate that roughly
matches the more detailed constraints of [258,259]. Suppose the high-redshift
XRB is emitted at a median redshift z by a population of black holes with
comoving mass density $\rho_{\rm BH}$ and radiative efficiency $\epsilon$, with a fraction $f_{\rm HXR}$
of the radiation emerging in the [0.5-2](1 + z) keV range. The comoving energy
density in hard X-rays is $\rho_{\rm HXR} \sim \epsilon \rho_{\rm BH} c^2 f_{\rm HXR}$ and the flux received at
Earth (in energy units) is $J = [c/(4\pi)]\,\rho_{\rm HXR}/(1+z)$. On the other hand, the
number of ionizing photons per baryon produced by these same black holes
is $N_{\rm ion} = (\epsilon f_{\rm UV}/n_b)(\rho_{\rm BH} c^2/\langle E \rangle)$, where $f_{\rm UV}$ is the fraction of the bolometric
luminosity that emerges above 1 Rydberg and $\langle E \rangle$ is the average energy per
ionization. Then, the SXRB observed at the present day is




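$$J_X = \frac{c}{4\pi}\,\frac{f_{\rm HXR}}{f_{\rm UV}}\,\frac{N_{\rm ion}\, n_b \langle E \rangle}{1+z} \sim 3.2 \times 10^{-12} \times \cdots\ \mathrm{erg\,s^{-1}\,cm^{-2}\,deg^{-2}}, \qquad (118)$$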

where $f_{\rm HXR}/f_{\rm UV}$ and $\langle E \rangle$ are appropriate for a spectrum with $L_\nu \propto \nu^{-1}$
ranging from 13.6 eV to 10 keV. Thus, the XRB produced if quasars or mini-quasars
alone reionized the Universe probably violates observed limits, although there
are considerable uncertainties. It certainly does not preclude a significant
contribution ($N_{\rm ion} \sim$ few) to the observed WMAP optical depth. 24 Moreover,
the amount of IGM preheating required for 21 cm emission will not violate
observed constraints:



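$$J_X \sim 2.7 \times 10^{-14} \left(\frac{T_K}{1500\ \mathrm{K}}\right) \left(\frac{f_{\rm HXR}}{f_{\rm SXR}}\right) \left(\frac{0.13}{f_{X,h}}\right) \times \cdots\ \mathrm{erg\,s^{-1}\,cm^{-2}\,deg^{-2}}, \qquad (119)$$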

where $f_{\rm SXR}$ is the fraction of bolometric luminosity emerging in the relevant
0.3-2 keV bands and $f_{X,h}$ is the fraction of the primary electron's energy which
goes into heat (this depends on the ionization fraction; see eq. 69).
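
The preheating estimate can be checked to order of magnitude with a few lines. This sketch assumes a comoving baryon density of $2.5 \times 10^{-7}$ cm$^{-3}$, one thermal particle per baryon, and emission at a single redshift; these are simplifications introduced here, and the function name is hypothetical.

```python
import math

C_CM_S = 2.99792458e10                 # speed of light, cm/s
K_B_ERG = 1.380649e-16                 # Boltzmann constant, erg/K
N_B_CM3 = 2.5e-7                       # assumed comoving baryon density, cm^-3
SR_PER_DEG2 = (math.pi / 180.0) ** 2   # steradians in one square degree


def preheating_xrb(t_k=1500.0, z=14.0, f_hxr_over_sxr=1.0, f_xh=0.13):
    """Present-day intensity (erg/s/cm^2/deg^2) implied by heating the IGM to t_k."""
    heat_per_baryon = 1.5 * K_B_ERG * t_k         # thermal energy deposited
    soft_energy = heat_per_baryon / f_xh          # soft X-ray energy emitted
    rho_hxr = N_B_CM3 * soft_energy * f_hxr_over_sxr  # comoving hard X-ray energy density
    j_per_sr = C_CM_S / (4.0 * math.pi) * rho_hxr / (1.0 + z)
    return j_per_sr * SR_PER_DEG2


print(f"{preheating_xrb():.2e}")
```

For 1 + z = 15 this gives about $3 \times 10^{-14}$ erg s$^{-1}$ cm$^{-2}$ deg$^{-2}$, the same order as the coefficient in equation (119) and well below the unresolved SXRB.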

Observational constraints on the X-ray luminosity associated with high-redshift
star formation are less tight. In principle, the SKA will be capable of detecting
synchrotron radiation from high-redshift sources [163]. Since the same
relativistic electrons producing synchrotron emission also inverse-Compton scatter
CMB photons to X-ray energies, such detections could place a bound on X-ray
emission from supernova remnants given reasonable assumptions about
the magnetic fields in supernova remnants: $L_X/L_{\rm sync} = u_\gamma/u_B$, where $u_\gamma$ and
$u_B$ are the energy densities in the CMB and magnetic fields, respectively.
Intriguingly, the expected amount of high-redshift star formation can also
produce a gamma-ray background comparable to the unresolved component from
EGRET [163], a scenario which could be tested by GLAST. Unfortunately,
this technique would not constrain other sources of X-ray photons, especially
X-ray binaries.

24 Note that this argument can be adapted for other emission mechanisms by
adjusting $f_{\rm HXR}/f_{\rm UV}$ and $\langle E \rangle$.



8 Reionization 

We will now consider perhaps the most exciting aspect of the 21 cm signal:
its potential to teach us how and when the Universe was reionized. We have
already reviewed the basic physics driving $\bar{x}_i(z)$ in §3.4. Here we will focus on
the "geometry" or "topology" of reionization (i.e., how the neutral and ionized
gas was distributed at a given $\bar{x}_i$), because the 21 cm line provides by far the
best avenue to study this crucial aspect.

8.1 Simulations

Because the physics governing inhomogeneous reionization is so complex, nu- 
merical simulations are the ideal way to approach it, and there has been a 
great deal of interest in this possibility over the past decade [108, 109, 198, 
200,204,329-333]. The basic requirements are (i) code to evolve the density 
field, (ii) a prescription assigning ionizing source parameters, and (iii) code 
to compute the radiative transfer of ionizing photons through the gas density 
field. 

The first component solves for the growth of structure in an expanding uni- 
verse; it is of course necessary for any cosmological simulation, so the relevant 
codes are now well-established. The principal challenge for reionization studies 
is the computational expense of hydrodynamics. Thus, often the gas dynam- 
ics are ignored and pure N-body codes, which track only the dark matter, are 
employed [108,109,331,332]. Although this allows significantly larger volumes 
to be studied, it requires the gas distribution, and especially star formation 
rates and small-scale clumping, to be prescribed in a similar way to analytic 
models.

The second component is an algorithm to assign luminosities to ionizing
sources. As we have emphasized in §3.4, any such procedure is fraught with un- 
certainties because we know so little about high-redshift galaxies. Simulations 
are typically calibrated to the observed properties of z < 3 galaxies, supplemented
by an extrapolation to higher redshifts. Of the parameters that determine
the ionizing efficiency $\zeta$ (see eq. 83), simulations can in principle most
improve our estimates of $f_\star$ because they follow the gas as it collapses to high
densities (though this is more difficult in pure N-body simulations, of course).
Unfortunately, the simplest prescription (the so-called Schmidt law [334]), 
which is motivated by observed star formation rates in nearby galaxies [335], 
tends to overproduce stars (hardly surprising even to its creator, given its sim- 
plified assumptions; [336]). It must therefore be supplemented by prescriptions 
for feedback regulation of star formation - internal, external, or both, in the 
language of §3.4.3. Examples include adding supernova winds carrying gas par- 
ticles out of galaxies or introducing a multiphase interstellar medium on the 
subgrid level (e.g., [337]). The resulting models are then calibrated to specific 
local observations (such as the star formation rate-density relation [335, 337] 
or the mean star formation rate [329]). Modern codes produce reasonable 
agreement with a wide range of observations at z < 4 (e.g., [338,339]). How- 
ever, the star formation and feedback parameters may evolve with redshift, 
which could have important implications for reionization (e.g., [340]). Unfortunately,
even with self-consistent star formation rates, simulations are no better
than analytic models at determining $N_{\rm ion}$ or $f_{\rm esc}$, which are generally
prescribed constants. (Although the metallicity can, to some extent, be followed,
that has not been factored into the ionizing efficiencies in existing simula- 
tions.) Thus simulations suffer from the same systematic uncertainties about 
the source population as analytic models. The range of source prescriptions, 
from purely analytic [341,342] to semi-analytic [108,109,331,343] to "fully" 
numeric [198,329], illustrates the diversity of approaches to these problems. 

Finally, radiative transfer is a cutting-edge problem that has received a great 
deal of attention lately. Computing the specific intensity $I_\nu(t, \mathbf{x}, \hat{\mathbf{n}}, \nu)$ requires
solving a seven-dimensional problem: time $t$, position $\mathbf{x}$, frequency $\nu$, and
direction of propagation $\hat{\mathbf{n}}$. Furthermore, simulations can contain hundreds
of thousands of sources, even excluding the diffuse light generated by IGM 
recombinations. Thus the complete problem is prohibitively expensive, and 
approximate schemes are necessary. Existing approaches include the "local 
optical depth approximation" [204,329], the "optically thin variable Eddington 
tensor" approximation [200,344], adaptive ray tracing [108,198,330,333,345], 
and Monte Carlo techniques [331,332,346]. We will forgo the details of these
various approaches and refer the interested reader to the individual papers 
for more information. For us, the crucial result is that, for the most part, the 
existing codes yield reliable and consistent results [347]. 

To further complicate matters, feedback demands that these three components
interact with each other: supernova winds and photons affect the gas
distribution, which affects the halo (and hence star) formation rate, which 
affects the ionization rate, etc. Prescriptions for mechanical feedback are now 
incorporated into many simulations, but the results can be quite sensitive 
to uncertainties in the observations and implementation, especially at high 
redshifts [340]. Including photoheating requires the radiative transfer to be 
incorporated directly into the simulation. This is possible [329], but in most 
cases radiative transfer is added as a post-processing step [198,330] and hence 
has no dynamical effect. Of course, whenever N-body simulations are used, 
this kind of feedback is implicitly ignored anyway, because the photons can- 
not interact with dark matter particles [108,109,331]. Thus, at least for now, 
the feedback processes described in §3.4.3 must be explored by analytic or 
semi-analytic means. 

As in most cosmological applications, the biggest computational challenge is 
resolving all the relevant structures while also subtending a large enough vol- 
ume to be representative of the Universe as a whole. The most obvious criterion 
for a "representative volume" is that the density fluctuations on the scale of 
the box be small. This is, of course, easy to check and also relatively easy to 
achieve: at z ~ 6, the nonlinear mass scale corresponds to $\sim 10^6\,M_\odot$. With
this criterion in mind, most of the first generation of reionization simulations
used boxes < $10\,h^{-1}$ Mpc on a side [198,204,329,330,333] (or $20\,h^{-1}$ Mpc with
pure dark matter codes [331,332]). Some of these simulations could resolve all
galaxies with $T_{\rm vir} > 10^4$ K, although none included minihalos.

These simulations focused on understanding the global evolution of reionization,
tracking $\bar{x}_i(z)$ for a given source population, and on the "breakout" phase
of ionization fronts around galaxies. But they also began to address questions 
about inhomogeneous reionization by observing where HII regions appeared 
and how they grew. Unfortunately, it quickly became obvious that these vol- 
umes were insufficient for answering many questions about reionization. One 
problem is that high-redshift galaxies are so highly biased that even these large 
boxes missed many of them [348] . A second difficulty, more important for our 
purposes, is that (again because of clustering) reionization in each box was 
driven by just a few clumps of sources. Thus all of these simulations sampled 
only a few HII regions and could not adequately model the inhomogeneity. A 
third problem, relevant to 21 cm studies, is that realistic experiments will have 
angular resolutions of several comoving Mpc - comparable to the total sizes 
of these simulations! This motivated the development of the analytic models 
that we will describe in §8.2. 

As a result, attention has recently focused on performing reionization simula- 
tions in much larger boxes - with sides > 100 Mpc [108,109,200]. These must 
use pure N-body codes, and even then the dynamic range is not good enough 
to resolve all of the collapsed halos, so many ionizing sources are missed
(especially if reionization is tuned to occur at high redshifts, when the nonlinear
mass scale is much smaller). Moreover, the clumpiness is completely unresolved
(especially without hydrodynamics). Fortunately, both of these problems
can be overcome, at least approximately, through analytic models or by
"bootstrapping" from smaller boxes where the relevant structures are better 
resolved. For example, grid-based radiative transfer codes (such as ray-tracing 
algorithms) require the clumping factor and emissivity of each grid cell. These 
can both be extracted from comparable regions in higher-resolution simulations
[200, 205] (note, however, that clumpiness measured in this way cannot
include the back-reaction of reionization itself, which can be substantial - see 
§3.4.1). The hope, of course, is that these approximations will not affect the 
large-scale features of reionization. Given the resolution of 21 cm experiments, 
this is probably not a bad approximation. 

The leftmost column in Figure 16 shows an example of a reionization simulation
in a $(65.6\,h^{-1}\ \mathrm{Mpc})^3$ N-body simulation taken from [109] (which uses
the radiative transfer code of [198]). The ionizing efficiencies were assigned
so that reionization ends slightly before z = 6. It resolves galaxies with
$M > 2 \times 10^9\,M_\odot$ and underestimates the recombination rate by ignoring
small-scale clumping. Several crucial qualitative features are apparent in
Figure 16. First, the ionized regions rapidly attain large sizes. At z = 7.68, when
$\bar{x}_i$ = 0.35, the characteristic bubble size is ~ 3 Mpc, with some bubbles
already larger than 10 Mpc. The typical size reaches ~ 20 Mpc by z = 6.89. As
we shall see, this is a direct result of the highly clustered galaxy distribution. 
These large features should make the HII regions much easier to observe with 
21 cm surveys. 

Second, reionization begins in and around the densest regions. In Figure 16,
the ionized bubbles predominantly appear around large-scale overdensities,
leaving voids neutral until the end of reionization. This "inside-out" picture
of reionization is generic to any scenario in which the sources are more biased
than the recombining gas [22, 198]. 25 Third, the bubbles are initially roughly
spherical but take more complex shapes as they reach scales > $3\,h^{-1}$ Mpc. 26
Nevertheless, many galaxies contribute to each bubble even when they are still
roughly spherical: evidently, before reionization, the cosmic web plays only a
secondary role in the large-scale geometry of reionization (although this is
partly a product of the coarse resolution of the simulation).

25 This is not to say that, on a local level, reionization cannot be "outside-in" with
dense blobs ionized after their low-density surroundings [10,22]. Instead we mean
that photons originate in dense environments and so must at least ionize the more
rarefied gas in their local overdensity before escaping toward large voids. In many
ways, the apparent dichotomy between these two kinds of models is a false one.

26 Resolving small galaxies is important for this aspect, because they provide the
small-scale structure visible in Fig. 16: resolution much worse than used here creates
a "cookie-cutter" morphology in which merged bubbles appear to be built from
nearly spherical subunits [108,109].

Fig. 16. Slices through a simulation of reionization at z = 8.16, 7.68, and 6.89
(top to bottom); these have $\bar{x}_i$ = 0.13, 0.35, and 0.55, respectively. The colorscale
is proportional to the HI density. The slice is $65.6\,h^{-1}$ comoving Mpc across and
$0.25\,h^{-1}$ Mpc deep. The left column uses a radiative transfer code; the right column
applies a slightly modified version of [341] to the same density field. The middle
column applies that ionization criterion to the overdensity of bound halos. The
points inside the HII regions in the left and middle columns identify the ionizing
sources. From [109].

Computing the 21 cm signal from a simulation requires one additional piece
of information: the spin temperature. Unfortunately, as we have seen in §3,
this depends on the UV and X-ray emissivities of the sources, which must be
prescribed based on some analytic model. (No reionization simulation has
attempted radiative transfer for these components.) A few do try to compute the
evolution of all three quantities self-consistently on a global level [184,200,349],
but most simulations assume that $T_S \gg T_\gamma$ throughout the box. Although
extremely simplistic, this is most likely a reasonable assumption for $\bar{x}_i$ > 0.1
(see §3 and [271]). In this limit $\delta T_b$ becomes independent of $T_S$, and the 21
cm signal can easily be computed from the simulation outputs. Figure 17
shows slices through the largest calculation to date (an N-body simulation in
a $100\,h^{-1}$ Mpc box) [205]. The slices are ~ 49' across; the two panels effectively
assume different ionizing efficiencies and hence have reionization end at
different times. The lower panel also imposes subgrid clumpiness that evolves
with time (eq. 85), slowing down reionization. Because the 21 cm signal essentially
traces the neutral gas density, the maps look quite similar to those of
Figure 16: relatively weak fluctuations from the cosmic web in predominantly
neutral gas and significantly stronger fluctuations from HII regions. This is
true even though the assumed reionization redshifts are quite different: as we
will see below, the qualitative features of reionization at a fixed $\bar{x}_i$ are only
weakly dependent on redshift. Note, however, that toward the end of this
simulation, the bubbles "percolate" and the topology inverts itself, transforming
from a "Swiss cheese" universe composed of neutral gas with ionized holes to
a sea of ionized gas with islands of neutral material. This behavior is generic
to any percolation phenomenon [350].
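
In the saturated limit the conversion from simulation outputs to a 21 cm map is a cell-by-cell product. The sketch below assumes the commonly used normalization $\delta T_b \approx 27\,x_{\rm HI}(1+\delta)[(1+z)/10]^{1/2}$ mK for a fiducial cosmology and ignores peculiar velocities; the 27 mK prefactor and the function names are assumptions introduced here, not values taken from this section.

```python
import math


def brightness_temperature_mk(x_hi, delta, z, t0_mk=27.0):
    """21 cm brightness temperature (mK) in the limit T_S >> T_gamma.

    t0_mk is an assumed normalization for a fiducial cosmology;
    peculiar velocities are ignored.
    """
    return t0_mk * x_hi * (1.0 + delta) * math.sqrt((1.0 + z) / 10.0)


def brightness_map(x_hi_grid, delta_grid, z):
    """Apply the formula cell by cell to two same-shaped 2D lists."""
    return [
        [brightness_temperature_mk(x, d, z) for x, d in zip(x_row, d_row)]
        for x_row, d_row in zip(x_hi_grid, delta_grid)
    ]


# Toy 2x2 slice: one fully ionized cell and three neutral cells.
x_hi = [[1.0, 1.0], [0.0, 1.0]]
delta = [[0.0, 0.5], [2.0, -0.5]]
for row in brightness_map(x_hi, delta, z=9.0):
    print(row)
```

The ionized cell contributes no signal regardless of its density, which is why HII regions imprint much stronger contrast than the cosmic web alone.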

8.2 Analytic Models 

Given its intrinsic complexity, reionization may seem a difficult problem for 
analytic models. But they are useful for several reasons. Most importantly, 
they give us physical insight into how the seemingly complex global ionization 
pattern is generated: as we will see, to first order it is shockingly simple. 
Second, they have essentially infinite dynamic range, avoiding many of the 
resolution problems of the simulations. Third, they allow us to fill parameter 
space efficiently, which is crucial given the large uncertainties in the input 
parameters. Finally, simulations themselves require analytic approximations 
for the sources and clumping, so it is best to understand them in detail! 
Figures 16 and 17 give us hope that such models might be successful: in them, 
reionization is driven by the large-scale clustering of ionizing sources - which 
can be described by linear theory. This allows us to construct simple models 
for the size distribution of HII regions as a function of $\bar{x}_i$. While these models
must ultimately be compared to numerical simulations, they provide a great
deal of intuition about the 21 cm signal.

Fig. 17. The evolving brightness temperature of the 21 cm transition in two
simulations of reionization (tuned so that reionization occurs at different times). The box
subtends ~ 49' in the x-direction. From [205].

The key ingredients to a model of HII regions are their number density as a
function of mass $n_b(m)$, their clustering strength, and their correlation with
the underlying density field. The only possible solution that approaches anything
like "first principles" is to examine the growth of HII regions around
individual galaxies [351,352]; unfortunately, this model problem is probably
only relevant for the first galaxies, before clustering becomes significant. One
simple generalization is to assume that bubbles trace dark matter halos but
to allow for clustering in an approximate way. For example, specifying $\bar{x}_i(z)$
and the characteristic bubble size $R_c(z)$ determines $n_b$ uniquely, and the bubbles
can then be naturally associated with dark matter halos of that number
density [295]. (Of course, the associated halos need not provide all the ionizing
photons for the bubble; their neighbors could contribute as well.) Such a
model takes advantage of the highly-developed halo-clustering machinery to
describe the bubble pattern [102], but it requires $R_c(z)$ to be specified "by
hand."

A more satisfying approach is to estimate $R_c(z)$ on physical grounds. The
key ingredient is the source clustering, which implies that overdense regions
(with many sources) will be ionized before underdense regions [348]. We begin
by assuming that $\bar{x}_i = \zeta f_{\rm coll}$, where $\zeta$ is the ionizing efficiency defined in
equation (83). 27 Consider an isolated (large) region of the IGM with mass
$m$ and mean fractional overdensity $\delta$. It is fully ionized if the local collapse
fraction $f_{\rm coll}(\delta, m)$ satisfies

$$f_{\rm coll}(\delta, m) > \zeta^{-1}, \qquad (120)$$

where for the Press-Schechter mass function 28 [101] (c.f. eq. 5)

$$f_{\rm coll}(\delta, m) = \mathrm{erfc}\left\{ \frac{\delta_c(z) - \delta/D(z)}{\sqrt{2\left[\sigma^2(m_{\min}) - \sigma^2(m)\right]}} \right\}. \qquad (121)$$



Thus, for isolated regions, we can restate the ionization condition as a criterion 
on the local smoothed density field. 
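
The criterion can be turned into a threshold on the smoothed density with a short numerical sketch. It assumes the collapse fraction has the form $\mathrm{erfc}\{[\delta_c(z) - \delta/D(z)]/\sqrt{2[\sigma^2(m_{\min}) - \sigma^2(m)]}\}$ with $\delta_c(z) = 1.686/D(z)$ and variances extrapolated to the present day; the value 1.686, the example parameters, and the function names are assumptions made for illustration.

```python
import math

DELTA_C0 = 1.686  # assumed linear collapse threshold; delta_c(z) = DELTA_C0 / D(z)


def f_coll(delta, sigma2_m, sigma2_min, growth):
    """Collapse fraction in a region of overdensity delta at growth factor D(z)."""
    delta_c_z = DELTA_C0 / growth
    arg = (delta_c_z - delta / growth) / math.sqrt(2.0 * (sigma2_min - sigma2_m))
    return math.erfc(arg)


def is_ionized(delta, sigma2_m, sigma2_min, growth, zeta):
    """Self-ionization condition: zeta * f_coll >= 1."""
    return zeta * f_coll(delta, sigma2_m, sigma2_min, growth) >= 1.0


def barrier(sigma2_m, sigma2_min, growth, zeta, lo=-50.0, hi=50.0):
    """Overdensity at which zeta * f_coll = 1, found by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if is_ionized(mid, sigma2_m, sigma2_min, growth, zeta):
            hi = mid   # threshold is at or below mid
        else:
            lo = mid   # threshold is above mid
    return hi


b = barrier(sigma2_m=1.0, sigma2_min=40.0, growth=0.1, zeta=40.0)
print(b, 40.0 * f_coll(b, 1.0, 40.0, 0.1))
```

Bisection is used because the Python standard library has no inverse complementary error function; the collapse fraction is monotonic in the overdensity, so the root is unique.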

We would like a method to compute the statistical distribution of regions
satisfying this condition, as a function of mass. In that case we cannot treat
each region in isolation: a low-density void lying near a cluster of sources
will be ionized by its neighbors. Interestingly, stated this way the problem is
similar to the "excursion set" derivation of the Press-Schechter mass function
[189,353]. In that case, the critical overdensity $\delta_c(z)$ describes the condition for
virialization: any region with $\delta > \delta_c(z)$ is part of a halo. But this prescription
also suffers from a problem similar to our void/ionizing cluster situation (here
it is known as the "cloud-in-cloud" problem): a point can be part of many
regions with $\delta > \delta_c(z)$ (on different mass scales). For halos, as for HII regions,
only the largest of these is physically relevant, because it incorporates all of
the smaller ones. The excursion set formalism treats the problem as diffusion
in the $(\sigma^2, \delta)$ space [or equivalently $(m, \delta)$] with an absorbing barrier at $\delta_c(z)$;
here $\sigma^2$ plays the role of time (because the rms density fluctuation increases
monotonically toward smaller scales) and $\delta$ plays the role of space. Posed in
this way, one can compute the distribution of "crossing times" - or masses -
from which the halo mass function (eq. 5) follows immediately. This procedure
avoids the cloud-in-cloud problem by following trajectories from large to small
mass scales (or increasing $\sigma^2$) and assigning each diffusion trajectory to the
largest halo of which it is a part (this is why the barrier is absorbing).
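
The diffusion picture can be illustrated with a small Monte Carlo. The sketch below assumes uncorrelated Gaussian steps in $\delta$ as $\sigma^2$ increases (appropriate for a sharp k-space filter) and uses a constant barrier of unit height, for which the continuum crossing fraction is $\mathrm{erfc}[\delta_c/\sqrt{2\sigma^2_{\max}}]$. The function name, step count, and barrier height are illustrative choices.

```python
import math
import random


def first_crossing_variances(barrier, s_max, n_steps, n_walks, seed=0):
    """Variance at first barrier crossing for random walks in (sigma^2, delta).

    barrier(s) gives the absorbing barrier at variance s. Each walk starts at
    delta = 0 on the largest scale (s = 0) and takes independent Gaussian steps
    of variance ds as the smoothing scale shrinks. Walks that never cross
    before s_max are omitted from the result.
    """
    rng = random.Random(seed)
    ds = s_max / n_steps
    step_sigma = math.sqrt(ds)
    crossings = []
    for _ in range(n_walks):
        delta = 0.0
        for i in range(1, n_steps + 1):
            delta += rng.gauss(0.0, step_sigma)
            if delta >= barrier(i * ds):
                crossings.append(i * ds)  # absorbed at the largest scale that crosses
                break
    return crossings


delta_c = 1.0
crossed = first_crossing_variances(lambda s: delta_c, s_max=4.0, n_steps=400, n_walks=4000)
print(len(crossed) / 4000, math.erfc(delta_c / math.sqrt(2 * 4.0)))
```

The simulated fraction falls slightly below the analytic value because discrete steps can jump over and back under the barrier between samples; refining the step size reduces the offset. Passing a mass-dependent `barrier` function gives the corresponding distribution for HII regions.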

27 This is not strictly necessary [342], but it makes the argument simpler. For now
we will ignore recombinations.

28 Note that the following conclusions do not depend on the precise form of the mass
function [342].

The HII bubble problem has an identical structure except that the absorbing
barrier $\delta_x$ comes from the condition $f_{\rm coll}(\delta, m) = \zeta^{-1}$ (instead of simply
equaling $\delta_c$) and is hence a function of mass. Fortunately, this barrier is
well-approximated by a linear function in $\sigma^2$, $\delta_x(\sigma^2) \approx B(\sigma^2) = B_0 + B_1 \sigma^2$, 29
which allows an analytic solution for the mass function [341,355,356]



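$$m\, n_b(m) = \sqrt{\frac{2}{\pi}}\, \frac{\bar{\rho}}{m} \left| \frac{d \ln \sigma}{d \ln m} \right| \frac{B_0}{\sigma(m)} \exp\left[ -\frac{B^2(m)}{2 \sigma^2(m)} \right]. \qquad (122)$$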



Note the similarity to the Press-Schechter halo mass function [101,189]. As
a consequence of this derivation, most of the machinery used for halo mass
functions, clustering, etc. can be carried over to HII regions. For example, the
linear bias of ionized bubbles, defined so that $n_b(m|\delta) = n_b(m)\,[1 + b_x(m)\,\delta]$
in a large region of mean overdensity $\delta$ [357-359], is [356]

$$b_x(m) = 1 + \frac{B(m)/\sigma^2(m) - 1/B_0}{D(z)}. \qquad (123)$$



The solid curves in Figure 18 show the resulting size distributions for a range of
$\bar{x}_i$ at z = 15; the ordinate is the fraction of the ionized volume filled by bubbles
of a given size. The results are similar to those from the simulations [109]:
most importantly, bubbles grow large during the middle stages of reionization,
with characteristic sizes $R_c$ ~ 1, 4, 10, and 30 comoving Mpc when $\bar{x}_i$ =
0.2, 0.4, 0.6, and 0.8.
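
The analytic solution rests on the first-crossing distribution for a linear barrier, $f(S) = B_0 (2\pi S^3)^{-1/2} \exp[-B^2(S)/2S]$ with $S = \sigma^2$ and $B(S) = B_0 + B_1 S$; the mass function follows from the standard change of variables $m\,n_b(m) = \bar{\rho}\, f(S)\, |dS/dm|$. The sketch below integrates $f(S)$ numerically and compares the result with the known total crossing probability $\exp(-2 B_0 B_1)$ of an idealized rising linear barrier. The barrier coefficients are arbitrary illustrative values and the function names are hypothetical.

```python
import math


def first_crossing_pdf(s, b0, b1):
    """f(S) for a linear barrier B(S) = b0 + b1*S, with S = sigma^2."""
    barrier = b0 + b1 * s
    return b0 / math.sqrt(2.0 * math.pi * s ** 3) * math.exp(-barrier ** 2 / (2.0 * s))


def total_crossing_probability(b0, b1, s_lo=1e-4, s_hi=1e4, n=20000):
    """Trapezoidal integral of f(S) dS on a logarithmic grid in S."""
    log_lo, log_hi = math.log(s_lo), math.log(s_hi)
    h = (log_hi - log_lo) / n
    total = 0.0
    for i in range(n + 1):
        s = math.exp(log_lo + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * first_crossing_pdf(s, b0, b1) * s * h  # dS = S dlnS
    return total


print(total_crossing_probability(1.0, 0.5), math.exp(-2.0 * 1.0 * 0.5))
```

A rising barrier ($B_1 > 0$) is never crossed by some trajectories, unlike the constant Press-Schechter barrier. In the physical model the barrier is only defined for $\sigma^2 < \sigma^2(m_{\min})$, so this normalization is a property of the idealized linear fit rather than a prediction for the ionized fraction.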

Of course it is possible to implement more sophisticated source prescriptions.
One possibility is to explicitly associate ionizing sources (either stars or black
holes) with halo mergers; the results differ in detail but are qualitatively similar
[360]. The dashed lines in Figure 18 show another, which assumes that
$\zeta \propto m_h^{2/3}$. In this case massive galaxies provide a larger fraction of the ionizing
photons, and $R_c$ increases [342]. To understand why, note that $R_c$ is the scale
at which a "typical" density fluctuation is able to ionize itself; mathematically,
it is where $\sigma(R_c) \approx B$. In the large bubble limit ($B \approx B_0$), our original
ionization criterion becomes

$$\zeta f_{\rm coll}(\delta = B_0, \sigma^2 = 0) = 1. \qquad (124)$$



Expanding equation (121) to linear order, this can be written 

a(R c ) ^B ^ (125) 



29 Recently, an elegant numerical solution to the diffusion problem confirmed the
accuracy of this approximation [354].

Fig. 18. HII region size distributions at $z = 15$ for the model of [341,342]. The solid
and dashed curves assume $\zeta \propto m_h^0$ and $m_h^{2/3}$, respectively. From left to right within
each set, we assume $\bar{x}_i = 0.05$, 0.2, 0.4, 0.6, and 0.8. Recombinations are assumed
to be uniform throughout the IGM.

where $\bar{b}_{\rm eff}$ is the average galaxy bias [359]. Intuitively, a more biased galaxy
population provides a larger "boost" to the underlying dark matter fluctu- 
ations, allowing larger regions to ionize themselves. Thus, by measuring the 
HII region sizes through the 21 cm transition, we can constrain the galaxies 
driving reionization. 

Two more properties of equation (122) deserve emphasis. First, at a given $\bar{x}_i$,
$n_b(m)$ depends only weakly on redshift. This is because the shape of $f_{\rm coll}(\delta, m)$
evolves only slowly with redshift; quantitatively, $D(z)\,\bar{b}_{\rm eff}$ is roughly constant
for high-redshift galaxies [361]. Second, the width of $n_b(m)$ is ultimately
determined by the shape of the underlying matter power spectrum, which steepens
toward larger radii [342].

Another physically motivated estimate of $R_c$ yields similar results [288,362]. It
differs from this model primarily by imposing a maximum size determined
by the light crossing time at the end of reionization. This is only one of several
mechanisms that effectively limit the range of "causal contact" between
sources; the most important is likely to be recombinations.

Incorporating inhomogeneous recombinations into this analytic model is relatively
straightforward [22]. Each HII region obviously contains density fluctuations.
Because the recombination rate increases like $(1 + \delta_{\rm nl})^2$, where $\delta_{\rm nl}$
is the fully nonlinear fractional overdensity, dense clumps will remain neutral
longer than voids will. Thus, following [10], we make the simple ansatz that
there exists a threshold density $\delta_i$, below which gas is ionized and above which
it is neutral.$^{30}$ Any ionizing photons striking these dense blobs will be lost
to recombinations in the neutral gas. Thus, for an ionized bubble to continue
growing, the mean separation of these dense blobs must exceed the radius
of the bubble. Given a model for the volume-averaged IGM density distribution,
$P_V(\delta_{\rm nl})$, $\delta_i$ can therefore be computed by requiring the mean free path
between such regions to equal the bubble radius. Clearly it increases as the
bubble grows, so denser and denser gas is ionized. But this will also increase
the recombination rate per proton, which is



$$ A_{\rm rec} = \alpha(T)\,n_e\,(1 + \delta) \int_{-1}^{\delta_i} d\delta_{\rm nl}\, P_V(\delta_{\rm nl})\,(1 + \delta_{\rm nl})^2 \qquad (126) $$

$$ = \alpha(T)\,n_e\,C(\delta, R), $$

where C(S, R) is the local clumping factor. The bubble can only grow if ionizing 
photons are produced more rapidly than recombinations consume them; in 
other words if 

$$ \zeta\,\frac{d f_{\rm coll}(\delta, R)}{dt} > \alpha(T)\,n_e\,C(\delta, R). \qquad (127) $$



The crucial point is that $C$ depends on both the mean density of the bubble
(recall that on large scales overdense regions are ionized first in this model) and
on its size (through $\delta_i$); as expected from §3.4.1, inhomogeneous reionization
affects the clumping factor.$^{31}$ Recombinations become increasingly important
as bubbles grow; eventually they balance ionizations and the bubbles saturate,
becoming true cosmological Strömgren spheres.
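
To see how the recombination rate of equation (126) responds to the ionization threshold, the sketch below evaluates the integral of $P_V(\delta_{\rm nl})(1+\delta_{\rm nl})^2$ up to $\delta_i$. It assumes a lognormal $P_V$ with unit mean density purely for illustration; this is a stand-in, not the simulation fit of [10], and the width `s`, the function names, and the numerical grid are all assumptions of the example.

```python
import math

def clumping_integral(delta_i, s=1.0, n=20000):
    """Integral of P_V(delta_nl) * (1 + delta_nl)^2 from -1 to delta_i (> -1),
    for an ASSUMED lognormal P_V of width s with unit volume-averaged density."""
    mu = -0.5 * s**2                        # fixes the mean of (1 + delta_nl) to 1
    u_lo = mu - 10.0 * s                    # u = ln(1 + delta_nl); deep-void cutoff
    u_hi = math.log1p(delta_i)
    if u_hi <= u_lo:
        return 0.0
    step = (u_hi - u_lo) / n

    def integrand(u):
        pdf_u = math.exp(-(u - mu)**2 / (2.0 * s**2)) / (s * math.sqrt(2.0 * math.pi))
        return math.exp(2.0 * u) * pdf_u    # (1 + delta_nl)^2 times the PDF in u

    total = 0.5 * (integrand(u_lo) + integrand(u_hi))
    for i in range(1, n):
        total += integrand(u_lo + i * step)
    return total * step

def recombination_rate(delta_i, mean_delta, alpha, n_e, s=1.0):
    """A_rec = alpha * n_e * (1 + delta) * integral, as in equation (126)."""
    return alpha * n_e * (1.0 + mean_delta) * clumping_integral(delta_i, s)
```

Raising `delta_i` (a larger bubble ionizing denser gas) increases the integral monotonically, which is the behavior that eventually lets recombinations balance ionizations.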

Equation (127) complements our original ionization condition, equation (120), 
which requires that the cumulative number of ionizing photons exceeds the to- 
tal number of hydrogen atoms. Of course, in reality both conditions must be 
fulfilled, but in practice one of the two generally dominates [10,22]. (This is
essentially because recombinations take over only when $\delta_i$ approaches the
characteristic density of virialized objects, or in other words when "Lyman-limit"
systems dominate the mean free path, as in the lower-redshift Universe.) As a
consequence, it is possible to combine the two conditions in the excursion set
formalism and compute the "bubble" sizes including recombinations. However,
we must keep in mind that in this case the model describes the mean free path
of ionizing photons. When recombinations are unimportant, this equals the size
of isolated bubbles. But once the bubbles "saturate" as Strömgren spheres,
neighboring HII regions can touch; it is only that their ionizing photons
will not influence each other. The model therefore describes how the
"bubble-dominated" topology characteristic of reionization maps smoothly onto the
"web-dominated" topology of the post-reionization IGM.

30 Of course, this cannot be exactly true, because galaxies are embedded in dense
filaments, so ionizing photons do not immediately reach the voids. This model
implicitly assumes that these recombinations are incorporated into $f_{\rm esc}$ (see [200] for
a similar application to simulations).

31 Note that this model is therefore both "inside-out" on large scales and
"outside-in" on small scales.

The key input parameter is obviously $P_V(\delta_{\rm nl})$, which describes the IGM clumpiness.
As we have emphasized repeatedly, this is difficult to specify because it
depends on reionization itself. The distribution has been measured in simula- 
tions at z ~ 2-4 [10], but they assumed the gas to be highly ionized and hence 
smooth on the Jeans scale of photoheated gas. While it is still neutral, and 
even shortly after it is ionized, the gas could be much clumpier - although 
how much depends on the unknown X-ray heating rate [321]. If minihalos 
form, they could dramatically increase the clumpiness [201]. However, recent
numerical simulations [199,202,363] and analytic models [203] show that mini- 
halo evaporation occurs quickly enough that they do not have an enormous 
effect on reionization. Thus for now the most widely-used choice is the fit to 
lower-redshift simulations by [10]. 

Figure 19 shows some example bubble size distributions with this $P_V(\delta_{\rm nl})$. The
axes are identical to Figure 18, and the thin curves show the same model as
in that figure (with $\zeta \propto m_h^0$). The thick curves (with the filled hexagons) add
recombinations as described above. They have only a modest effect on $n_b(m)$
when $\bar{x}_i < 0.75$, truncating the distribution's tail but not affecting most bub-
bles. This is because small bubbles need not ionize any of their dense gas; it 
is only unusually large bubbles that ionize dense clumps deeply enough for 
recombinations to become significant. However, as typical bubbles grow be- 
yond $\sim 20$ Mpc, recombinations completely dominate their size distribution,
imposing a well-defined $R_{\rm max}$ on it. Qualitatively, at this point the typical size
of an ionized bubble equals the typical separation of Lyman-limit systems. 
Thus most ionizing photons are absorbed by one of these objects before hit- 
ting the edge of a bubble; Lyman-limit systems are so dense and thick that 
they cannot easily be "burned off" and so impose a sharp limit on the bubble 
sizes. This saturation radius depends sensitively on the assumed IGM den- 
sity distribution; measuring it would thus constrain the growth of small-scale 
structure. It is also sensitive to the evolution of the ionizing emissivity [342]. 

Of course, this approach must ultimately be validated against simulations. For- 
tunately, because the model is constructed from the density field, it is easy to 
compare it to specific simulations in a detailed, and not simply statistical, sense 
[109,286,364]. We return to our basic ionization criterion, $\zeta f_{\rm coll}(\delta, m) > 1$.

Fig. 19. HII region size distributions at $z = 8$ for the models with and without
inhomogeneous recombinations (thick and thin curves, respectively). The dotted,
short-dashed, long-dashed, and solid curves assume $\bar{x}_i = 0.41$, 0.68, 0.84, and
0.92, respectively. Recombinations sharply truncate the bubble size distribution at
$R \approx R_{\rm max}$. The filled hexagons show the additional fraction of the IGM volume in
bubbles with $R = R_{\rm max}$. From [22].

This is a condition on the linear density field: the same field that serves as 
the initial condition for simulations. At any redshift, the linearly-evolved den- 
sity field can be smoothed around each point on progressively smaller scales, 
checking whether the ionization criterion is fulfilled. Each point satisfying it 
(at any scale) is tagged as ionized: points ionized by distant neighbors pass 
the density threshold on large scales, while points ionized by nearby sources 
pass it on small scales. The rightmost column of Figure 16 shows an example 
of this approach in the same box used for the radiative transfer simulation 
shown in the leftmost column [109].$^{32}$ Considering the simplicity of the ana-
lytic model, the agreement is remarkable. Large HII regions are identified in 
a nearly one-to-one manner (although their detailed shapes differ). Thus the 
basic principles of the model, and the generic predictions we have emphasized, 
appear to be accurate. This scheme allows the rapid generation of large, high- 
resolution reionization histories, even with modest computational facilities, 
and it reproduces the complex geometries of ionized bubbles reasonably well. 

32 Note that recombinations are not included in the analytic version, because small- 
scale clumping was also neglected in the simulation. But they can easily be incor- 
porated in the model by simply using both ionization criteria; in fact, given the 
difficulty of properly adding recombinations to simulations, this is an attractive 
option. 




Thus we expect it to provide a powerful tool in studying reionization and its 
observable implications. 
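
The tagging procedure described above (smooth the linear density field on progressively smaller scales and flag every point that satisfies $\zeta f_{\rm coll} > 1$ at any scale) can be sketched in a few lines. The example below is a deliberately minimal one-dimensional toy: the white-noise field, the values of `delta_c` and `sigma2_min`, the list of smoothing radii, and the function names are all assumptions chosen for illustration, whereas the implementations in [109] operate on three-dimensional simulation density fields.

```python
import math
import random

def tophat_smooth(field, radius):
    """Periodic 1D top-hat average over 2*radius + 1 cells."""
    n = len(field)
    width = 2 * radius + 1
    return [sum(field[(i + j) % n] for j in range(-radius, radius + 1)) / width
            for i in range(n)]

def tag_ionized(delta, zeta, delta_c, sigma2_min, radii):
    """Mark a cell ionized if zeta * f_coll >= 1 at any smoothing radius."""
    n = len(delta)
    ionized = [False] * n
    for radius in sorted(radii, reverse=True):           # largest scales first
        smoothed = tophat_smooth(delta, radius)
        mean = sum(smoothed) / n
        sigma2 = sum((d - mean) ** 2 for d in smoothed) / n   # sigma^2(R) measured from the field
        if sigma2 >= sigma2_min:
            continue                                      # below the minimum source scale
        norm = math.sqrt(2.0 * (sigma2_min - sigma2))
        for i, d in enumerate(smoothed):
            f_coll = min(1.0, math.erfc((delta_c - d) / norm))   # collapse fraction, capped at 1
            if zeta * f_coll >= 1.0:
                ionized[i] = True
    return ionized

rng = random.Random(42)
delta = [rng.gauss(0.0, 1.0) for _ in range(512)]         # toy stand-in for the linear density field
radii = [0, 1, 2, 4, 8, 16, 32]
maps = {zeta: tag_ionized(delta, zeta, delta_c=2.0, sigma2_min=1.5, radii=radii)
        for zeta in (0.5, 5.0, 20.0)}
fractions = {zeta: sum(m) / len(m) for zeta, m in maps.items()}
```

Because the criterion is monotonic in $\zeta$, every cell ionized at a low efficiency remains ionized at a higher one, so the ionized regions grow from the densest peaks outward as in the "inside-out" picture.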

Nevertheless, it is vital to explore the reasons for the disparities with the 
full simulations. One interesting difference is that the simulations tend to 
show larger HII regions appearing in the later stages of reionization [108]; 
this is inherent to percolation processes, where an infinitely large HII region 
containing islands of neutral gas must eventually appear. In the simulations 
this manifests itself through HII regions much larger than expected from the 
naive analytic model in the later stages of reionization (beyond the stage 
shown in Fig. 16), although note that applying the analytic criterion to the 
simulation's density field does correctly describe this topology inversion, so 
it is not a result of new physics. Moreover, we have already emphasized that 
the analytic model is better suited to describe the mean free path of ionizing 
photons in this regime; comparisons during the late stages must await the 
self-consistent incorporation of recombinations into simulations. 

Another difference is that the galaxy population contains stochastic fluctua- 
tions over and above those sourced by density fluctuations. Analytic models 
suggest that, in most cases, the effects of this scatter on $n_b(m)$ are modest
once the bubbles become large [342]. But this will certainly affect the detailed
distribution of ionized gas. One way to address its importance is to apply the 
ionization criterion to the halo density field rather than the dark matter [109]. 
This is shown by the center column in Figure 16, which clearly provides an 
even closer match to the radiative transfer results, especially in regard to the 
connectivity of the ionized regions. This algorithm requires running an N-body 
simulation (to generate the halo field) but eliminates the radiative transfer, so 
it still dramatically speeds up the calculation. Obviously there are a variety 
of approaches to inhomogeneous reionization, of increasing sophistication and 
accuracy, which can be tailored to the task at hand. We expect the continued 
development of reionization models (incorporating, for example, feedback pro- 
cesses [365] and more sophisticated star formation algorithms [360]) to sharpen 
our expectations for studies of reionization. 



8.3 Statistical Predictions 



We now turn more specifically to predictions for 21 cm observations and what 
they can teach us about inhomogeneous reionization. We will begin with sta- 
tistical measurements, because they are most likely to be accomplished first.

8.3.1 The Isotropic Power Spectrum



The simplest possible statistic is the power spectrum, neglecting peculiar ve- 
locities and other factors that introduce anisotropies into the signal (see §4.1 
and 8.3.2). This requires computing the power spectrum of ionized bubbles, 
which was first written down in the context of "patchy reionization" CMB 
anisotropies [366,367]. A number of analytic models have been presented in 
the literature, including simple tracers of the density field [368] and direct 
association with dark matter halos [295]. We will first describe how to build
$P_{21}(k)$ from the physically-motivated reionization model discussed above. This
provides intuition for more detailed investigations with simulations, which are 
necessary given the complex shapes of real HII regions. 

We expect the two-point function to have the form [341]

$$ \langle x_i(\mathbf{r})\,x_i(\mathbf{r}') \rangle = \bar{x}_i^2 + (\bar{x}_i - \bar{x}_i^2)\,f(r/R_c), \qquad (128) $$

where $r = |\mathbf{r} - \mathbf{r}'|$, $R_c$ is the characteristic bubble size, $f(r/R_c) \approx 1$ for $r \ll R_c$,
and $f(r/R_c) \approx 0$ for $r \gg R_c$. This form is necessary because the bubbles have
a finite size: thus two nearby points will either both be ionized by the same
bubble (with probability $\bar{x}_i$) or both be neutral. At large separations, on the
other hand, the points must be ionized by two distinct bubbles, each with
probability $\bar{x}_i$.

In constructing $f$ we must take care because $x_i$ has a restricted range from
zero to unity. Thus $\bar{x}_i = 1$ implies $x_i = 1$ at every point in the IGM, and
the correlations must vanish in both limits $\bar{x}_i \to (0, 1)$. This is a particular
concern for statistical models of the bubble distribution, because it means
that bubbles cannot overlap and we cannot simply distribute them randomly.
One solution is to treat the ionization as a Poisson process [295,341], which
prevents more than one bubble from covering any particular location but has
the unattractive feature of failing to conserve photons.

It is therefore more illuminating to try to construct $f(r/R_c)$ directly while
taking care to enforce the proper limiting behavior. By analogy to the halo
model for density fluctuations [102], we separate it into two components: the
probability $p_1$ for which a single bubble ionizes both points and the probability
$p_2$ for which two separate bubbles do so [341]. Then

$$ p_1(r) = \int dm\, n_b(m)\, V_o(m, r), \qquad (129) $$

where $V_o(m, r)$ is the volume within which the center of a bubble of mass
$m$ (or radius $R$) can sit while still ionizing both points. Note that $p_1 \approx \bar{x}_i$
for $r \ll R_c$ but $p_1 \approx 0$ for sufficiently large $r$. The two-bubble term can be
written [356]:



$$ p_2(r) = \int dm_1\, n_b(m_1) \int d^3\mathbf{r} \int dm_2\, n_b(m_2) \int d^3\mathbf{r}'\, \left[1 + \xi_{bb}(\mathbf{r} - \mathbf{r}'\,|\,m_1, m_2)\right], \qquad (130) $$



where $\xi_{bb}$ is the bubble correlation function. Clearly $p_2 \to \bar{x}_i^2$ as $r \to \infty$.
The challenge posed by the discrete bubbles is to set the spatial integration
limits to simultaneously avoid overlap (to force $p_2 \to 0$ as $r \to 0$) while still
allowing bubbles to reside near to each other. This is difficult because of the
assumed spherical symmetry. (The basic problem is similar to packing a crate
with oranges: small gaps are invariably left between the bubbles. In reality, of
course, the HII regions can assume arbitrary sizes to fill space.)
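
For spherical bubbles the one-bubble term has a simple geometric form, because the region in which a bubble center can sit while covering both points is the lens-shaped intersection of two spheres of radius $R$ centered on those points. The sketch below evaluates that overlap volume and the resulting $p_1(r)$ for the simplest possible case, bubbles of a single radius with number density $\bar{x}_i/V$ (which neglects overlap); the single-size population and the function names are assumptions of the example rather than the model of [341].

```python
import math

def overlap_volume(R, r):
    """Volume common to two spheres of radius R whose centres are a distance r apart."""
    if r >= 2.0 * R:
        return 0.0
    return math.pi / 12.0 * (4.0 * R + r) * (2.0 * R - r) ** 2

def one_bubble_term(r, R, x_i):
    """p_1(r) for bubbles of one radius R filling a fraction x_i of space
    (number density x_i / V, so bubble overlap is neglected)."""
    bubble_volume = 4.0 / 3.0 * math.pi * R ** 3
    n_b = x_i / bubble_volume
    return n_b * overlap_volume(R, r)

# Example: 10 Mpc bubbles filling 30% of the volume, sampled at a few separations.
samples = [one_bubble_term(r, R=10.0, x_i=0.3) for r in (0.0, 5.0, 10.0, 20.0)]
```

The limits quoted in the text follow directly: at zero separation the overlap volume equals the full bubble volume, so $p_1 = \bar{x}_i$, while for $r \geq 2R$ no single bubble can cover both points and $p_1 = 0$.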

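An approximate solution that matches simulation power spectra reasonably
well (at least at $\bar{x}_i < 0.8$) is [356]

$$ \langle x_i x_i' \rangle = \begin{cases} (1 - \bar{x}_i)\,p_1(r) + \bar{x}_i^2, & \bar{x}_i > 0.5, \\ p_1(r) + p_2(r), & \bar{x}_i < 0.5, \end{cases} \qquad (131) $$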



with $\xi_{bb} \approx \bar{b}_x^2\,\xi_{\delta\delta}$ on sufficiently large scales and $\xi_{\delta\delta}$ the dark matter correlation
function. This form is intuitively simple: when $\bar{x}_i < 0.5$, overlap is relatively
unimportant and we can use the one- and two-bubble terms as is. At larger
ionized fractions, overlap becomes significant, but by this point the characteristic
bubble size $R_c > 5$ Mpc, and $\xi_{bb}(R_c)$ is sufficiently small that setting
$p_2 \approx \bar{x}_i^2$ everywhere suffices. Thus we only need the one-bubble term $p_1(r)$
(weighted by the neutral fraction to enforce the proper limits). This basic
form captures most of the relevant features of the correlations (see [288] for a
simplified version using bubbles of a single size).

Because $\delta T_b \propto x_{\rm HI}\,n$, we also require the cross-correlation $\xi_{x\delta}$ between the
ionized fraction and density. This is implicit in the analytic model, which
calculates the bubble geometry from the density field itself [341,356]. The
resulting expressions have similar ambiguities and limiting behaviors to $\langle x_i x_i' \rangle$.

Armed with the correlation functions $\xi_{xx}$, $\xi_{x\delta}$, and $\xi_{\delta\delta}$ (and the corresponding
power spectra),$^{33}$ the isotropic power spectrum of 21 cm brightness temperature
fluctuations, $P_{21}^{\rm iso}$, is$^{34}$ [341]

$$ P_{21}^{\rm iso}(k) = P_{\delta\delta}(k) + P_{xx}(k) - 2P_{x\delta}(k) + P_{x\delta x\delta}(k); \qquad (132) $$

to convert to temperature fluctuations, we must multiply this by the square of the
mean brightness temperature, $\bar{\delta T}_b^{\,2}$. The four-point term $P_{x\delta x\delta}$ is in general
complicated; we will take $P_{x\delta x\delta} = P_{x\delta}^2 + P_{xx}P_{\delta\delta}$
by dropping the connected part.$^{35}$

33 Note that we write $P_{xx}$ for the power spectrum of $\delta_x$, the fractional perturbation
in $x_{\rm HI}$ used in equation (92). This differs from the definition in [356].

Figure 20 shows several examples. We set $z = 10$ and vary the ionizing efficiency
$\zeta$ to examine a range of $\bar{x}_i$. When $\bar{x}_i \ll 1$, HII regions are unimportant
and $P_{21}(k)$ essentially traces the matter power spectrum (the sharp rise at
$k > 10h$ Mpc$^{-1}$ is due to small scale nonlinearities). When the bubbles appear,
they initially suppress the power because they ionize the highest density regions
first. Soon, however, $P_{xx}$ becomes large and begins to dominate the fluctuations,
creating a "shoulder" at $k_{\rm pk}$. The scale of this feature is directly related
to the characteristic bubble size, and $k_{\rm pk}$ decreases throughout reionization;
moreover, the more sharply peaked the bubble distribution, the more distinct
the peak. For $k < k_{\rm pk}$, $P_{21} \propto P_{\delta\delta}$ when $\bar{x}_i$ is small (it is simply amplified by
the effective bias of the bubbles), but once the bubbles become large their
Poisson fluctuations dominate and $P_{21}$ approaches white noise. This peak is
probably the single most important feature for 21 cm experiments, because it 
directly and clearly constrains reionization and the sources responsible for it. 
Its detection is one of the major goals of 21 cm observatories. 
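
The way the individual terms combine can be made concrete with a small helper. The sketch below works with correlation functions evaluated at a single separation, where the products in the four-point closure are ordinary multiplications (in Fourier space they become convolutions), and adopts the sign convention of equation (132). The Gaussian closure $\xi_{x\delta}^2 + \xi_{xx}\xi_{\delta\delta}$ for the four-point term and the function name are assumptions of the example.

```python
def xi_21(xi_dd, xi_xx, xi_xd):
    """Real-space analogue of equation (132) at one separation.

    xi_dd : density autocorrelation
    xi_xx : autocorrelation of the fractional neutral-fraction perturbation
    xi_xd : cross-correlation between the two
    """
    four_point = xi_xd ** 2 + xi_xx * xi_dd     # connected part dropped (assumed Gaussian closure)
    return xi_dd + xi_xx - 2.0 * xi_xd + four_point

# Before bubbles matter (xi_xx = xi_xd = 0) the signal simply traces the density field.
early = xi_21(0.8, 0.0, 0.0)
# Once the ionization terms are comparable to the density term they reshape the total.
later = xi_21(1.0, 0.5, 0.2)
```

With the ionization terms set to zero the function returns the density correlation unchanged, matching the statement that the 21 cm signal traces the matter distribution when the ionized fraction is small.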

Of course, the power spectrum is best measured in simulations of reionization, 
which can capture its complex geometry - although large volumes are obvi- 
ously necessary (cf. [271,369]). Figure 21 compares the power spectra of the 
ionized fraction in full radiative transfer simulations (solid curves) to imple- 
mentations of the analytic model in the same box (using the linear density field 
in the dotted curves and the halo distribution in the dashed curves); these are 
the same fields shown in Figure 16 [109]. Qualitatively the results are similar 
in the three cases: the power spectrum has a well-defined peak that moves to 
larger scales and eventually fades into Poisson noise as $\bar{x}_i$ increases. We see
$k_{\rm pk} \sim 0.01$-$0.1h$ Mpc$^{-1}$ over this range, as expected from [341]. However, the
radiative transfer simulations do have somewhat less power on large scales and
somewhat more on small scales; the latter is due to fluctuations in the recom-
bination rate and in the halo distribution. The agreement in $P_{21}$ is similarly
good [109], and other simulations show qualitatively similar behavior, with a
peak developing in both $\Delta_{xx}$ [108] and the angular power spectrum of 21 cm
fluctuations [205] that moves to larger scales throughout reionization.

34 Here we assume $T_S \gg T_\gamma$ throughout the IGM; if this is not the case, as may
occur early in reionization, other terms sourced by $T_S$ will appear.

35 Formally, this is only valid for a gaussian field. But comparisons to numerical
simulations show that it is a reasonable approximation [356].

Fig. 20. Rms variation in the 21 cm brightness temperature as a function of
wavenumber at several different stages of reionization, ignoring peculiar velocities.
All the curves assume $z = 10$. They have $\bar{x}_i = 0.13$ (dotted curve), $\bar{x}_i = 0.36$
(dash-dotted curve), $\bar{x}_i = 0.48$ (short-dashed curve), $\bar{x}_i = 0.69$ (long-dashed curve),
and $\bar{x}_i = 0.78$ (solid curve). Following [278,341].

Thus, although numerical implementations are clearly required to produce 
detailed power spectra (and by extension other statistical measures), the an- 
alytic model, especially as implemented in realistic density fields, can provide 
a great deal of intuition about the signal. For example, measuring the shape 
of $P_{21}$ will teach us about the sources of reionization (through their cluster-
ing) [342], the role of recombinations [22], feedback during reionization, and 
even whether X-rays have uniformly ionized the IGM [370]. 

8.3.2 Anisotropies

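The velocity field makes $P_{21}$ anisotropic (see §4.1) and hence considerably more
complicated. Using the linear theory approximation $\delta_{\partial v} = -\mu^2 f \delta_L$, where $\delta_L$
is the linear density field, we can write (with $f = 1$)$^{36}$ [278]

$$ P_{21}(\mathbf{k}) = P_{21}^{\rm iso}(k) + 2\mu^2\left[P_{\delta_L\delta}(k) - P_{x\delta_L}(k)\right] + \mu^4 P_{\delta_L\delta_L}(k) + 2P_{x\delta\delta_L x}(\mathbf{k}) + P_{x\delta_L\delta_L x}(\mathbf{k}). \qquad (133) $$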



36 In practice $P_{\delta_L\delta_L}$ and $P_{\delta\delta}$ are identical on the scales accessible to experiments,
but we separate them in order to label those terms arising from velocities.

Fig. 21. Comparison of dimensionless power in the ionization fraction between a
simulation and the analytic model. The solid, dashed, and dotted curves correspond
to the three different columns in Fig. 16. From [109].



Fig. 22. 21 cm brightness temperature power spectra at four different stages of
reionization. In each panel, the solid, dash-dotted, and dashed curves take $\mu = 0$, 0.5,
and 1, respectively. The model follows [341] with $\zeta = 12$. From [278].

The first three terms in equation (133) show the $\mu^0$, $\mu^2$, and $\mu^4$ dependencies
originally derived by [277], but the last two terms are considerably more
complicated. Although they appear to be higher order, they must be included
because $\delta_x$ is not small during reionization, so these four-point terms are still
only second order in $\delta$. Unfortunately, they have nontrivial $\mu$ dependence; for
example,

$$ P_{x\delta\delta_L x}(\mathbf{k}) = \int \frac{d^3k'}{(2\pi)^3}\,(\hat{\mathbf{n}} \cdot \hat{\mathbf{k}}')^2 \left[P_{x\delta_L}(k')\,P_{x\delta}(|\mathbf{k} - \mathbf{k}'|) + P_{\delta\delta_L}(k')\,P_{xx}(|\mathbf{k} - \mathbf{k}'|)\right]. \qquad (134) $$

Thus, when they are large, they will ruin the decomposition in powers of $\mu$
proposed by [277].
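
When the four-point terms are negligible, the angular decomposition amounts to fitting a quadratic in $\mu^2$ at each wavenumber. The sketch below performs that least-squares fit for $P(\mu) = P_{\mu^0} + \mu^2 P_{\mu^2} + \mu^4 P_{\mu^4}$ and applies it to the pre-reionization limit of equation (133), in which all density power spectra coincide and $P \propto (1 + \mu^2)^2$. The function names, the sampled $\mu$ values, and the input amplitude are illustrative assumptions.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x

def fit_mu_powers(mus, powers):
    """Least-squares coefficients (P0, P2, P4) of P(mu) = P0 + P2*mu^2 + P4*mu^4."""
    basis = [[1.0, mu ** 2, mu ** 4] for mu in mus]
    normal = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    rhs = [sum(row[i] * p for row, p in zip(basis, powers)) for i in range(3)]
    return solve3(normal, rhs)

# Pre-reionization limit: P(mu) = P_dd * (1 + mu^2)^2 at one wavenumber.
mus = [0.0, 0.25, 0.5, 0.75, 1.0]
p_dd = 3.0                                    # arbitrary amplitude for the example
powers = [p_dd * (1.0 + mu ** 2) ** 2 for mu in mus]
p0, p2, p4 = fit_mu_powers(mus, powers)
```

In this limit the fit returns coefficients in the ratio 1:2:1; a residual that cannot be absorbed by these three terms would signal the more complicated $\mu$ dependence of the four-point contributions.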

Figure 22 shows the 21 cm power spectra at several stages of reionization for
$\mu = 0$, 0.5, and 1 (solid, dash-dotted, and dashed curves, respectively). When
$\bar{x}_i = 0$, only the $P_{\delta\delta}$ and $P_{\delta\delta_L}$ terms remain and redshift space distortions
significantly enhance the power in line of sight modes, as usual. This remains
true when $\bar{x}_i$ is small, because the enhancement from the velocity field is
of course independent of reionization. However, the $P_{xx}$ term is isotropic, so
once it begins to dominate, the total power becomes nearly independent of
the underlying velocity field (and hence of $\mu$). Only on scales smaller than
the characteristic bubble size does the power spectrum remain anisotropic, 
partly because the four-point terms can be large in this regime [278]. Thus, 
once reionization gathers steam, redshift space distortions will be difficult to 
separate and to interpret cleanly; they can probably only be used to measure 
cosmological parameters before reionization. 

Another source of anisotropy is evolution in the density field and neutral gas
distribution across the observational bandwidth. When $P_{21} \propto P_{\delta\delta}$ (i.e., when $\bar{x}_i$
is small or on scales much larger than the bubbles), this only affects the overall
amplitude of the power spectrum and can be modeled reasonably well. Only
in the late stages of reionization, and only if the remaining voids are ionized
rapidly, do these "light-cone" effects introduce measurable anisotropies that
dominate those of the velocity field [278,288].



8.3.3 Nongaussianity

To this point, we have focused exclusively on the power spectrum. To many 
readers familiar with the CMB and galaxy surveys, this will seem natural: in 
those cases, the power spectrum provides a nearly perfect statistical descrip- 
tion, because the relevant fields are (to an excellent approximation) gaussian. 
Of course, for the 21 cm signal during reionization, this gaussian assumption 
breaks down. A glance at Figures 16 or 17 shows immediately that the fluctua- 
tions are dominated by the striking contrast between the neutral IGM and the 
HII regions; the (gaussian) density fluctuations in the residual neutral gas play 
only a secondary role. Only on scales so large that $P_{21} \propto P_{\delta\delta}$ does the gaus-
sian approximation hold [371]. Exploring these signatures of nongaussianity is
crucial for extracting the maximal information from upcoming experiments, 
especially because it helps to distinguish the cosmological signal from the fore- 
grounds. Unfortunately, it is also difficult and so has received relatively little 
attention. 

The obvious place to begin is with the probability distribution function (PDF)
of pixel fluxes considered as a function of smoothing scale. Analytic models
show that, because of the correlation between $x_i$ and $\delta$, this will have
interesting and nontrivial structure during reionization [370]. Figure 23 shows
several examples taken from a simulation of reionization [205]. The maps have
been smoothed over 20, 10, and $5h^{-1}$ Mpc; the figure also shows the best
fit gaussians. When $\bar{x}_i = 0.5$, the PDF has a cutoff at large $\delta T_b$ and an
excess at negative $\delta T_b$ (from HII regions). The cutoff is caused by "inside-out"
reionization, in which sources ionize their (dense) surroundings first. It is a
direct result of the ionization criterion $\zeta f_{\rm coll} > 1$; if voids were instead
ionized first, the PDF would look much different, even if the power spectra were
comparable [370]. Note also that the distribution narrows as the smoothing
scale increases, because bubbles are no longer resolved. Later in reionization,
at $\bar{x}_i = 0.75$, the PDF has a strong peak at negative $\delta T_b$ and a long tail to-
ward higher fluxes. This occurs because, by this point, the simulation contains 
large, connected HII regions with discrete islands of neutral gas. Because these 
islands still have high contrast, they may be the easiest features to detect at 
the end of reionization [205]. 

Fig. 23. Pixel flux PDFs in a simulation of reionization. The left and right panels
have $\bar{x}_i = 0.5$ and 0.75, respectively. Within each panel, the maps are smoothed on
scales $20h^{-1}$, $10h^{-1}$, and $5h^{-1}$ Mpc (solid curves; narrowest to widest). The dotted
curves show the corresponding best fit gaussians. From [205].

The simplest way to approach nongaussianity is to try to measure this PDF
directly. In particular, if the bubbles are resolved, the distribution will have
two peaks: one centered on $\delta T_b = 0$ for pixels entirely inside HII regions
and one centered on $\delta T_b \sim 20$ mK. Thus, even factoring in noise near to
or larger than this separation, the underlying distribution will be bimodal.
Sensitive tests for bimodality in large datasets are available and relatively
easy to implement [279]. These techniques essentially compare the goodness
of fit of single and multiple gaussians to the data; they succeed because of the
enormous volume subtended by the fields of view of even the first generation
of telescopes. Numerical simulations show that this simple idea can work in
realistic datasets (although foreground contamination may be a problem).
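
The comparison between single and multiple gaussian fits can be illustrated with a short expectation-maximization routine. The sketch below is not the specific test of [279]; it is a minimal stand-in that fits one and two gaussians to mock pixel values and compares them with the Bayesian information criterion. The mock data (equal populations at 0 and 20 mK with 5 mK noise), the quartile initialization, and all function names are assumptions of the example.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)

def loglike_one(data):
    """Maximum log-likelihood of a single gaussian."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    return sum(log_normal_pdf(x, mu, sigma) for x in data)

def mixture_parts(x, w, mu1, s1, mu2, s2):
    a = w * math.exp(log_normal_pdf(x, mu1, s1))
    b = (1.0 - w) * math.exp(log_normal_pdf(x, mu2, s2))
    return a, b

def loglike_two(data, n_iter=100):
    """Log-likelihood of a two-gaussian mixture fitted by expectation-maximization."""
    n = len(data)
    ordered = sorted(data)
    mu1, mu2 = ordered[n // 4], ordered[3 * n // 4]       # quartile starting guesses
    mean = sum(data) / n
    s1 = s2 = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    w = 0.5
    for _ in range(n_iter):
        resp = []                                          # E-step: membership probabilities
        for x in data:
            a, b = mixture_parts(x, w, mu1, s1, mu2, s2)
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        n1 = sum(resp)                                     # M-step: update the parameters
        n2 = n - n1
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        s1 = max(1e-6, math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1))
        s2 = max(1e-6, math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2))
        w = n1 / n
    return sum(math.log(sum(mixture_parts(x, w, mu1, s1, mu2, s2))) for x in data)

def bic(loglike, n_params, n_data):
    return n_params * math.log(n_data) - 2.0 * loglike

rng = random.Random(7)
pixels = [rng.gauss(0.0 if rng.random() < 0.5 else 20.0, 5.0) for _ in range(1000)]  # mock mK values
bic_one = bic(loglike_one(pixels), 2, len(pixels))
bic_two = bic(loglike_two(pixels), 5, len(pixels))
prefers_bimodal = bic_two < bic_one
```

A lower criterion for the two-component model indicates that the data favor a bimodal distribution even after penalizing the three extra parameters; with real maps the noise level and foreground residuals would determine how many pixels are needed.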

Of course, the PDF throws away all the information contained in spatial cor- 
relations so it is certainly not optimal. A complementary approach is to study 
"global" properties of the ionization pattern, such as the genus number [372]. 
This is particularly useful in identifying the topological transition from dis- 
crete HII regions to discrete HI clouds. 

We expect the continued development of strategies to detect the signatures of 
HII regions to be a major step toward reliable 21 cm data analysis algorithms. 
It is imperative that we identify robust and powerful statistics to extract as 
much information as possible from the data. A particularly important appli- 
cation is to develop tests for "bubble" structure or sharp edges in the data; 
statistical detection of such features, through methods like the bimodality test 
of [279], will vastly increase the believability of the results, because foreground
contamination and removal are unlikely to introduce sharp edges like these (at
least in the frequency direction, where all sources have smooth spectra).



8.4 The Imaging Regime 

The statistical measures described in the previous section will shed a great 
deal of light on the reionization epoch. However, even more exciting is direct 
three-dimensional tomography of the high-redshift IGM (e.g., [273,373,374]). 
The 21 cm line is by far the best way to perform such tomography: the CMB 
is a snapshot of a two-dimensional surface, and quasar absorption spectra only 
sample the IGM in sparse one-dimensional skewers. The low S/N of upcoming
21 cm experiments precludes direct imaging of the cosmic web in the near
term (see §9.1.2). However, the large HII regions blown by quasars or clustered 
galaxies will appear as holes in 21 cm emission and will have sufficiently high 
S/N to be detected individually. They may very well be the first unambiguous 
features of the epoch of reionization detected by these experiments. We there- 
fore focus here on the rich science returns from direct imaging of high-redshift 
HII regions. 

Detecting HII regions in the face of telescope noise requires that they be suf- 
ficiently large and that the contrast between neutral and ionized regions be 
sufficiently high. As we will see in §9.1.2, the brightness temperature sensi-
tivity decreases dramatically as the physical scale to be probed decreases (eq.
141). It should be compared to the temperature contrast
$\delta T_b \approx 22\,x_{\rm HI}(1 + \delta)[(1 + z)/7.5]^{1/2}$ mK (assuming $T_S \gg T_\gamma$) between neutral
and ionized regions (note that this contrast falls if the IGM is partially ionized
from X-rays or incomplete recombination in a fossil HII region). The angular and
frequency scales required for a $\sim 5\sigma$ detection correspond to $\sim 20h^{-1}$ Mpc with the first
beam size should be somewhat smaller than the bubble radius; as with power 
spectrum measurements [158], the optimal S/N generally results from match- 
ing the frequency resolution to the beam size: reducing the channel width 
when the bubble is unresolved yields no gains (because of beam dilution) but 
increases the noise level. 
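
The contrast quoted above is simple to evaluate for a given target. The sketch below codes the scaling $22\,x_{\rm HI}(1+\delta)[(1+z)/7.5]^{1/2}$ mK and divides it by a noise level to give a rough detection significance; the noise value in the example is a placeholder assumption rather than the sensitivity of any particular instrument, and the function names are illustrative.

```python
import math

def brightness_contrast_mK(x_HI, delta, z):
    """Contrast between neutral and ionized gas in mK, assuming T_S >> T_gamma."""
    return 22.0 * x_HI * (1.0 + delta) * math.sqrt((1.0 + z) / 7.5)

def detection_significance(x_HI, delta, z, noise_mK):
    """Contrast divided by the rms noise per resolution element (both in mK)."""
    return brightness_contrast_mK(x_HI, delta, z) / noise_mK

# Fully neutral gas at mean density around a z = 6.5 quasar, with an assumed 4.4 mK noise.
contrast = brightness_contrast_mK(1.0, 0.0, 6.5)
significance = detection_significance(1.0, 0.0, 6.5, noise_mK=4.4)
```

Halving the neutral fraction halves the contrast, so a partially ionized IGM requires correspondingly lower noise (or a larger HII region) for the same significance.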

The need for such large bubbles pushes us toward either those blown by bright 
quasars [273,373-375] or those characteristic of clustered galaxies late in the
reionization process [341]. The former are particularly attractive, since a num-
ber of $z > 6$ SDSS quasars with large proximity zones are already known [7];
if the IGM is significantly neutral ($x_{\rm HI} > 0.2$) at $z \sim 6.0$-6.5, their HII regions
should be detectable and will likely be the first imaging targets of the
new generation of 21 cm experiments. In this context, the 21 cm fluctuations 
discussed in the last several sections (including the linear density field, the 
spin temperature, and unresolved bubbles) become noise [158,373,374]. This
is analogous to confusion noise in ordinary imaging and provides an irreducible
background in that additional integration does not reduce the effective noise 
level (except the contribution from bubbles, in so far as smaller bubbles can 
be identified). Thus, the optimal strategy is to integrate until the telescope 
noise reaches the same level as that due to the cosmic web, which is typically 
a few mK on the scales of interest. This limit will probably only be reached 
with SKA-class instruments. 37 

Given that we are looking for bright quasars, one might expect features to be 
extremely rare. However, 21 cm experiments have broad bandpasses and wide 
fields of view encompassing huge volumes. For instance, a simple extrapolation 
of the empirical quasar luminosity function at lower redshifts yields at least 
one quasar with $L_B > 2 \times 10^{10}\,L_\odot$ within the ~ 500,000 $(h^{-1}\,{\rm Mpc})^3$ volume
subtended by a single 10' synthesized beam from z = 6-12 [374]; such an 
object will blow an HII region of sufficient size to overcome the cosmic web 
variance. Similar estimates based on a semi-analytic model for the quasar 
luminosity function [373,375,376] (and also from a different extrapolation of 
the empirical luminosity function [6]), predict that HII regions should appear 
in each MWA-5000 38 pointing with a field of view of 400 deg$^2$ in a bandpass
of 16 MHz. They emphasize that the short duty-cycle of quasars (~ 0.01 at 
these redshifts) implies that the abundance of fossil HII regions - where the 
quasar has turned off but where (due to the long recombination time) the HII 
cavity remains - may be up to two orders of magnitude larger. They estimate 
~ 1 active/fossil quasar HII region with R > (24,40) Mpc, R > (18,36) Mpc, 
and R > (11, 22) Mpc (in comoving units) at z = 7, 8, and 10, respectively, in
a single MWA-5000 field of view [373]. 

What is the best observational strategy to pick out these HII regions? Statis- 
tical detection of boundaries in a noisy field is a classic problem with many 
applications from oceanography to medical imaging. The optimal strategy for 
21 cm surveys is still unclear. One possibility is to throw away spatial informa- 
tion and simply examine the PDF of pixel temperatures [279] . However, if the 
bubbles are large and the S/N per bubble is high, more straightforward tech- 
niques are applicable. For experiments operating solely in the frequency do- 
main, both the equivalent width and depth of spectral dips sourced by quasar 
HII regions are possible indicators [374] . The equivalent width depends on the 
lifetime of the quasar; for short-lived objects, it cannot be distinguished from 
the tail of Gaussian fluctuations. However, the depth is more robust since the 
gas is always highly ionized regardless of the HII region size. The optimal 
strategy for detecting dips depends on their distribution along a line of sight, 



37 See §9 below for a detailed discussion of the instruments mentioned in this section.
38 The MWA-5000 is a proposed larger version of the MWA described in §9, increas- 
ing the collecting area by about an order of magnitude. See [278,373] for details on 
its capabilities.

$[dN/d({\rm LOS})](> d_{\min}) \propto d_{\min}^{-\alpha}$ (where $d_{\min}$ is the limiting detectable depth in
a given observation). Since $d_{\min} \propto t^{-1/2}$, the total number of detected dips
$N \propto n\,t^{\alpha/2} \propto t^{\alpha/2-1}$ (where n is the number of sightlines, t is the time each is
observed, and we have assumed nt = constant). If $\alpha > 2$, it is more efficient to
integrate longer than to increase the number of lines of sight. Figure 24 shows 
some simulated three-dimensional cubes illustrating how quasar HII regions 
can be detected (from [373]). The MWA should be able to detect HII regions 
around the most luminous quasars; the follow-up MWA-5000 should be capa- 
ble of mapping the detailed geometry of HII regions (which can be complex if 
quasar emission is beamed), while the SKA can detect narrow spectral features 
and may measure the sharpness of the HII region boundary. 
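
The trade-off between depth and number of sightlines can be checked with a few lines of Python. This is only a sketch of the scaling argument above: it assumes the power-law form $[dN/d({\rm LOS})](> d_{\min}) \propto d_{\min}^{-\alpha}$ with $d_{\min} \propto t^{-1/2}$, works in arbitrary units, and the function name `relative_dip_count` is hypothetical.

```python
def relative_dip_count(t_per_sightline, alpha, total_time=1.0):
    """Relative number of detected dips at fixed total observing time.

    Assumes each sightline yields a number of dips proportional to
    t**(alpha / 2) and that the number of sightlines is n = total_time / t,
    so the total scales as t**(alpha / 2 - 1). Units are arbitrary.
    """
    n_sightlines = total_time / t_per_sightline
    return n_sightlines * t_per_sightline ** (alpha / 2.0)

# Compare quadrupling the time per sightline for shallow and steep distributions.
for alpha in (1.0, 2.0, 3.0):
    gain = relative_dip_count(4.0, alpha) / relative_dip_count(1.0, alpha)
    print(f"alpha = {alpha}: detections change by a factor {gain:.2f}")
```

For $\alpha > 2$ the factor exceeds unity, so deeper integrations on fewer sightlines win; for $\alpha < 2$ the opposite holds.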

What could we learn by imaging quasar HII regions? Some of the exciting 
possibilities are summarized in [373]. Most importantly, the contrast between 
HII regions and the IGM yields $x_{\rm HI}$ in a more reliable way than trying to fit for
$\delta T_b(z)$, since the HII regions (within which only the radio foregrounds appear,
provided the mass fraction of Lyman limit systems inside the HII regions is
small) calibrate the foregrounds as a function of frequency. Inferences about
high-redshift quasars are also possible, since full 3D data will be available
(unlike with the Lyα forest). Measurement of the shape of the HII region
could reveal the emission profile of the quasar (as in the conical profiles of
Fig. 24). The HII region sizes jointly constrain the quasar lifetime $t_{\rm QSO}$ and $x_{\rm HI}$.
Asymmetry in the transverse and line of sight sizes (with additional constraints
on the latter from the Lyα forest) can arise due to finite light travel time
effects. 21 cm and Lyα forest measurements sample different epochs in the evolution of
the HII regions; photons along the transverse direction were emitted earlier,
when the HII region was smaller. The asymmetry depends on the expansion
speed of the HII region, which for a given ionizing flux is proportional to $x_{\rm HI}$.
Thus it could potentially break the degeneracy between $x_{\rm HI}$ and $t_{\rm QSO}$, directly
constraining $x_{\rm HI}$ (with only a small dependence on quasar beaming) [375].
These possibilities are of course subject to caveats about radiative transfer 
through a filamentary IGM - which also breaks spherical symmetry - but at 
the least a statistical detection might be possible. Finally, near-IR followup of 
large (R > 20 Mpc) HII regions, which should be dominated by quasars, may 
be an efficient means of discovering high-redshift quasars, since only ~ 100 
pointings would be needed to find an active quasar - a vast improvement 
over blind searches. Such a survey would also measure the duty cycle (and 
lifetime) of quasars, provided that the largest bubbles are indeed sourced by 
bright quasars rather than clustered galaxies. These followup surveys should 
not be too difficult; quasars driving HII regions with R > 20 Mpc should be 
detectable with 8-meter class telescopes in 200-300 s. The next generation of 
large telescopes will allow us to probe further down the luminosity function 
and to search for bright galaxies. 

Fig. 24. Simulated maps of HII regions around z = 6.5 quasars with the MWA-5000,
assuming a 3' Gaussian beam top-hat smoothed at 0.5 MHz in a 1000 hr integration.
The first row shows an intrinsically spherical HII region. The second, third and
fourth rows show intrinsically bi-conical HII regions oriented along, at $\pi/4$ to, and
perpendicular to the line of sight, respectively (with opening angles of $\pi/3$). The
three maps show different slices of the frequency cube. The contours correspond to
5, 11, and 17 mK. The solid black dot at (-20,-20) shows the beam size. The dashed,
solid, and dotted spectra in the right-most panel correspond to the left, center, and
right lines of sight marked in the third frequency slice, while the dotted and dashed
vertical lines show the locations of the quasar and of the front of the HII region.
(Note that distances are in physical units here.) From [373].

One final possibility for high resolution instruments such as the SKA is to
measure the structure of the ionization front and emission/absorption shells
around a quasar. The thickness of the ionization front (which depends on 
the hardness of the ionizing source spectrum) could be used to discriminate 
between quasars and stars as ionizing sources [377] . This requires that the HII 
region sit in a neutral IGM; partial ionization by X-rays or relic HII regions
significantly reduces the power of this discriminant. In addition to the ionization
front, isolated quasars also have non-trivial temperature structure [273]: the
HII region would then be surrounded by a ring of 21 cm emission (where X-rays
from the quasar heat the IGM) fading into 21 cm absorption (where the
X-rays peter out and $T_K < T_\gamma$). However, this structure will disappear if
X-rays from other sources have heated the IGM on a more global level (as seems
likely by z ~ 6.5; see §3). If it does exist, the detection of absorption will
constrain the background UV flux of Lyα photons, because the Lyα photons
from the quasar itself are unable to couple the spin and kinetic temperatures
(largely because the HII region expands so fast) [375].



9 Low-Frequency Radio Observations 

In the past several chapters, we have described the power of 21 cm tomography 
for studying basic cosmology, high-redshift structure formation, and reioniza- 
tion. A number of experiments are currently being built to explore this signal; 
we list some of the largest in Table 6. Observing fluctuations during and be- 
fore reionization requires low-frequency telescopes (the observed frequency is 
$\nu = 1420/(1+z)$ MHz) with baselines of order a kilometer (in order to achieve
sufficiently good angular resolution). Thus all the experiments we will dis- 
cuss are interferometer arrays. The 21 Centimeter Array (21CMA, formerly 
known as PAST) is located in the Xinjiang province of western China and is 
currently being commissioned [378]. The Low Frequency Array 39 (LOFAR) is 
under construction in the Netherlands, with plans for a possible extension into 
the rest of Europe. LOFAR is a general purpose low-frequency radio telescope 
that will have both long and short baselines; it is the largest (both in terms of 
baselines and collecting area) of the first generation instruments. The compact 
core, which is most relevant to 21 cm studies, is scheduled for completion in 
2008. The Mileura Widefield Array Low Frequency Demonstrator 40 (which we 
will refer to as simply the MWA) is a smaller telescope to be built in western 
Australia, specializing in observations of the 21 cm background, radio tran- 
sients, and the heliosphere. It should also begin observations in two to three 
years. Finally, the Square Kilometer Array 41 (SKA) is the next-generation 
multi-purpose radio telescope, aiming for completion toward the end of the 
next decade. It is sufficiently early in the design process that its parameters 
in Table 6 should be considered educated guesses at best. Not included in Ta- 
ble 6 is the Precision Array to Probe Epoch of Reionization (PAPER), which 
aims to study the observational challenges, and strategies to overcome them, 
in detail as sensitivity is slowly increased. Thus its parameters are sufficiently 
dynamic that a summary as in Table 6 is not useful. An 8-dipole array has 
already been deployed in Green Bank, West Virginia, and plans call for later 
expansion to Western Australia (D. Backer, private communication). 

The purpose of this section is to review these challenges and to ground the
great promise of 21 cm studies in the practical considerations of the real world.
We begin with a brief discussion of low-frequency radio astronomy in §9.1. We
then describe how to estimate the sensitivity for statistical measurements in
§9.2. Finally, we end with discussions of foreground cleaning (§9.3) and some
other systematic difficulties (§9.4).

39 See www.lofar.org for more information.

40 See web.haystack.mit.edu/arrays/MWA/ for more information.

41 See www.skatelescope.org for more information.

Table 6

Existing and planned low-frequency radio telescopes and their approximate
parameters. The second column is the number of antenna elements. The third shows the
total collecting area $A_{\rm tot}$ at z = 8. We also quote the field of view (FoV) at z = 8;
it scales approximately with the diffraction limit $\theta_D^2$ (or with $[1+z]^2$). $D_{\min}$ is the
minimum baseline (in most cases determined by the area of a single element). $D_{\max}$
is the maximum baseline; for experiments labeled with "c" this is actually the
extent of the compact core (which is the only part included in this table). Parameters
are taken from [379,380] and the experiments' websites; all are approximate.

9.1 Overview

Low frequency radio telescopes operate as phase coherent apertures. The
"diffraction limit" dictates that the finest achievable angular resolution $\theta_D$
depends on the largest dimension $D_{\max}$ of the telescope, $\theta_D \sim \lambda/D_{\max}$; in our
case, we have $\theta_D \sim \lambda_0(1+z)/D_{\max}$, where $\lambda_0 = 21.1$ cm. The exact value of $\theta_D$
depends on the details of the aperture illumination, whether the telescope is
a single dish or an array of sub-apertures, and the procedures invoked during
calibration and image processing; a useful estimate for the redshifted 21 cm
line is $\theta_D \sim 1.2^\circ\,[(1+z)/10]\,(D_{\max}/100~{\rm m})^{-1}$.
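
The diffraction-limit estimate is easy to evaluate directly. The short Python sketch below uses the simple ratio of the redshifted wavelength to the largest dimension and ignores the order-unity dependence on aperture illumination mentioned above; the function name `diffraction_limit_deg` and the example baselines are illustrative assumptions.

```python
import math

LAMBDA_0_M = 0.211  # rest wavelength of the 21 cm line, in metres

def diffraction_limit_deg(z, d_max_m):
    """Approximate angular resolution (degrees) at the redshifted 21 cm line.

    Uses theta_D ~ lambda_0 (1 + z) / D_max, with D_max in metres.
    """
    wavelength_m = LAMBDA_0_M * (1.0 + z)
    return math.degrees(wavelength_m / d_max_m)

# A 100 m aperture and a 1 km baseline, both observing z = 9.
print(f"{diffraction_limit_deg(9.0, 100.0):.2f} deg")
print(f"{60.0 * diffraction_limit_deg(9.0, 1000.0):.1f} arcmin")
```

The second case illustrates why baselines of order a kilometer are needed to reach resolutions of several arcminutes at these redshifts.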

Telescope mirrors must be smooth to a fraction of a wavelength in order to
achieve diffraction-limited operation. In the $\nu < 200$ MHz regime relevant
for us, $\lambda > 1.5$ m, and this smoothness requirement is lax. This permits the
use of coarse mesh as a reflecting surface. Furthermore, simple, inexpensive
technologies, such as yagi antennae 42 and dipole arrays, can be used in
telescope designs, making it possible to build the required large collecting areas
at affordable costs.

42 Yagi or Yagi-Uda antennae (named after their inventors) are the most common
type of roof-mounted antenna for receiving television signals. They consist of a
driven dipole mounted between parallel elements (rods) that form a reflector behind
the dipole and a series of directors, which serve to increase the antenna gain in rough
proportion to the number of director elements.



A further consequence of diffraction-limited operation is that the sensitivity to 
a distant radio source whose angular size is less than $\theta_D$ results from the phase
coherent sum of the source's incident electric field over the entire telescope col- 
lecting area. In this regime, the detectability of compact radio sources is nearly 
independent of the telescope aperture distribution and can be estimated quite 
simply from the total telescope area. But the surface brightness sensitivity to 
radio sources resolved by the telescope ($\theta_S > \theta_D$) is highly dependent on the
design of the telescope, as we discuss in § 9.1.2. 

The sensitivity of a telescope system depends on the competition between the 
strength of the celestial signal collected by the antenna and the noise, which 
can have different origins. In many radio astronomy applications, the dominant 
noise contribution arises in the first amplifiers that are connected to the output 
of the antenna. The electrical junction between the antenna and this first 
amplifier of the receiving system is a convenient place to compare the celestial 
signal strength to the receiver noise level. The signal output of the antenna 
can be specified as an antenna temperature, $T_a$, which is the temperature
of a matched resistive load that would produce the same power level
($P_a = k_B T_a \Delta\nu$ for the resistor) as the signal power $P = A_e S_\nu \Delta\nu/2$ received in
one of two orthogonal antenna polarizations [381]. Here $S_\nu$ is the source flux
density (assuming an unpolarized source), $\Delta\nu$ is the observed bandwidth, and
$A_e$ is the effective collecting area of the antenna. These equations define the
antenna sensitivity factor $K_a = T_a/S_\nu = A_e/2k_B$ in units of K Jy$^{-1}$. 43
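
To make the units concrete, the following Python sketch converts an effective collecting area into the sensitivity factor $K_a = A_e/2k_B$ in K Jy$^{-1}$. The function names and the example area of $10^5$ m$^2$ are illustrative assumptions, not parameters of any particular instrument.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K
JY = 1.0e-26        # one jansky in W m^-2 Hz^-1

def antenna_sensitivity_K_per_Jy(a_eff_m2):
    """Sensitivity factor K_a = A_e / (2 k_B), expressed in K per Jy."""
    return a_eff_m2 * JY / (2.0 * K_B)

def antenna_temperature_K(flux_jy, a_eff_m2):
    """Antenna temperature T_a = K_a S_nu for an unpolarized source."""
    return antenna_sensitivity_K_per_Jy(a_eff_m2) * flux_jy

# Example: an effective area of 1e5 m^2 (illustrative value).
print(f"K_a = {antenna_sensitivity_K_per_Jy(1.0e5):.1f} K/Jy")
print(f"T_a for a 1 mJy source = {antenna_temperature_K(1.0e-3, 1.0e5):.3f} K")
```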

43 Some confusion surrounds the definition of $K_a$, because many early radio
astronomy systems received only a single polarization. Then, a $K'_a$ was defined as
$A_e/k_B$, with the implicit assumption of unpolarized sources. This definition
persists in many applications, and observers compensate by averaging (rather than
summing) the measurements from dual polarized systems to obtain the total flux
density.

Fig. 25. Brightness temperature of the radio sky at 150 MHz (from [383]) in Galactic
coordinates. Contours are drawn at 180 (dashed), 270, 360, 540, 1100, 2200, 3300,
4400, and 5500 K. The 21CMA/PAST [378] survey field at the North celestial pole
is cross-hatched. Heavy lines indicate constant declinations (−26.5°, +35°, and +54°),
with dots to mark 2 hour intervals of time. Star symbols indicate the coordinates of
the 4 highest redshift (z > 6.2) SDSS quasars (found with the NASA Extragalactic
Database, nedwww.ipac.caltech.edu).

The signal-to-noise ratio is assessed by comparing $T_a$ and $T_{\rm sys}$, the system
temperature, similarly defined as the temperature of a matched resistor input
to an ideal noise-free receiver that produces the same noise power level
as measured at the output of the actual receiver. Noise fluctuations $\Delta T^N$
decline with increased bandwidth and integration time $t_{\rm int}$ according to the
well-known radiometer equation (e.g. [382]):

$$\Delta T^N = \kappa_c \frac{T_{\rm sys}}{\sqrt{\Delta\nu\, t_{\rm int}}} \approx \frac{T_{\rm sys}}{\sqrt{\Delta\nu\, t_{\rm int}}}, \qquad (135)$$

where $\kappa_c \geq 1$ is a loss factor accounting for the details of the signal detection
scheme. For radio spectrometers, $\kappa_c$ depends on the number of bits of precision
used to quantify the signals when they are digitized. For interferometers, there
may be loss of signal when compensation is made for rapid fringe rotation.
For our purposes, this instrumental parameter will be close to unity, and we
will set $\kappa_c = 1$ throughout the remainder of this discussion.
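
The radiometer equation (135) can be evaluated with a minimal Python sketch such as the one below. The function name `radiometer_noise_K` and the example system temperature of 300 K are illustrative assumptions; the loss factor defaults to unity, as adopted in the text.

```python
import math

def radiometer_noise_K(t_sys_K, bandwidth_hz, t_int_s, kappa_c=1.0):
    """Thermal noise fluctuation (K) from the radiometer equation.

    Implements Delta T = kappa_c * T_sys / sqrt(bandwidth * integration time).
    """
    return kappa_c * t_sys_K / math.sqrt(bandwidth_hz * t_int_s)

# Illustrative case: T_sys = 300 K, 1 MHz bandwidth, 100 hours of integration.
noise_K = radiometer_noise_K(300.0, 1.0e6, 100.0 * 3600.0)
print(f"{1e3 * noise_K:.2f} mK")
```

Because the noise falls only as the square root of bandwidth and time, halving it requires four times the integration.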



9.1.1 The Radio Sky 

Pointing the beam of the radio telescope into a region of sky whose brightness
temperature is $T_{\rm sky}$ has the effect of adding noise power to that already present
in the receiver, so that $T_{\rm sys} > T_{\rm sky}$. In fact, at the low radio frequencies relevant
to the redshifted 21 cm line, the sky is so bright that $T_{\rm sys} \approx T_{\rm sky}$. Figure 25
maps the radio sky at 150 MHz, corresponding to z = 8.5 for the 21 cm line.
Everywhere on the sky, T sky is dominated by synchrotron radiation from fast 
electrons in the Milky Way; we will discuss the characteristics of this emission 
in more detail in §9.3. At these frequencies the Galactic Center is the strongest 
feature, and the brightness declines rapidly with Galactic latitude, making the 
Galactic poles attractive survey targets. Also of significance are two minima 
in the sky brightness located in the general direction of the anti-center, where 
the sky brightness reaches ~150 K. A rule of thumb for typical high-latitude, 
"quiet" portions of the sky is 



$$T_{\rm sky} \sim 180~{\rm K} \left(\frac{\nu}{180~{\rm MHz}}\right)^{-2.6}. \qquad (136)$$

The challenge at the heart of these observations is immediately apparent: the
foregrounds are four to five orders of magnitude larger than the expected
signal, so substantial collecting areas and/or long integrations will be required
to separate the cosmological component, even ignoring systematic effects! Also
note the rapid increase toward lower frequencies; for this reason, we expect the 
lowest redshifts (corresponding to reionization) to be by far the most accessible 
to observations. 
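
The steepness of this frequency dependence is easy to see numerically. The Python sketch below combines the observed frequency of the line, $\nu = 1420/(1+z)$ MHz, with the rule of thumb for quiet sky; the spectral index of $-2.6$ is an assumption taken to match the $(1+z)^{2.6}$ scaling of the single-dish noise estimate in §9.1.2, and the function names are hypothetical.

```python
def observed_frequency_MHz(z):
    """Observed frequency (MHz) of the 21 cm line emitted at redshift z."""
    return 1420.0 / (1.0 + z)

def sky_temperature_K(nu_MHz, index=-2.6):
    """Rule-of-thumb brightness temperature (K) of quiet, high-latitude sky.

    Normalized to 180 K at 180 MHz; the spectral index is an assumption.
    """
    return 180.0 * (nu_MHz / 180.0) ** index

# Compare the sky temperature with a ~10 mK signal at several redshifts.
for z in (6.0, 8.5, 12.0, 20.0):
    nu = observed_frequency_MHz(z)
    t_sky = sky_temperature_K(nu)
    print(f"z = {z:4.1f}: nu = {nu:6.1f} MHz, T_sky = {t_sky:7.0f} K, "
          f"T_sky / 10 mK = {t_sky / 0.010:.1e}")
```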

To minimize $T_{\rm sys}$, experiments must therefore choose fields where the sky
brightness is small. Figure 25 indicates the location of the 21CMA/PAST [378] 
field, fixed by the telescope design to point at the north celestial pole. Both 
LOFAR and MWA have steerable beams, which allows the selection of low sky 
brightness survey fields. Solid lines in Figure 25 trace the paths of the zenith 
through the course of a single day (at declinations +54° for LOFAR and at 
-26.5° for MWA). LOFAR fields chosen at declinations around +35° would 
lie in the heart of the northern cool spot. MWA will be able to observe fields 
in both the northern and southern cool zones. 

At low radio frequencies, non-thermal radiation from the Sun becomes ex- 
tremely bright and variable during periods of high solar activity. To minimize 
solar contamination, redshifted 21 cm observations must be made at night. 
Together with the restriction that cold regions of sky be accessible, 21 cm 
studies will have limited operating "seasons." For example, over a calendar 
year, any single field will only be accessible for ~ 1000 hours with MWA. 



9.1.2 Telescope Sensitivity: Imaging 

Computing the sensitivity to compact sources unresolved by the diffraction-limited
beam $\theta_D$ of the array is straightforward. The noise level (in flux density
units) can be written

$$\sigma_S = \frac{T_{\rm sys}/K_a}{\sqrt{\Delta\nu\, t_{\rm int}}}, \qquad (137)$$

where $K_a$ now includes the total effective collecting area of the telescope. As
a concrete example, the flux density corresponding to $\delta T_b \sim 10$ mK across a
25' HI cloud (or hole) at z = 8.5 (spanning a comoving diameter ≈ 65 Mpc)
is ~270 μJy. In theory, a telescope like Arecibo ($K_a \sim 10$ K Jy$^{-1}$) observing a
quiet patch of sky with 1 MHz bandwidth and dual polarizations could detect
such a hole in $t_{\rm int} \sim 35$ hours.
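
The numbers in this example can be roughly reproduced with the Python sketch below. It rests on several assumptions of ours that the text does not spell out: the cloud is treated as a uniform circular patch, the Rayleigh-Jeans relation converts brightness temperature to flux density, a detection means $5\sigma$, the two polarizations are averaged so that the noise drops by $\sqrt{2}$, and $T_{\rm sys} \approx 290$ K at 150 MHz. All function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K
JY = 1.0e-26        # one jansky in W m^-2 Hz^-1
LAMBDA_0_M = 0.211  # rest wavelength of the 21 cm line, in metres

def flux_density_Jy(delta_Tb_K, diameter_arcmin, z):
    """Flux density (Jy) of a uniform circular patch via Rayleigh-Jeans."""
    wavelength_m = LAMBDA_0_M * (1.0 + z)
    theta_rad = math.radians(diameter_arcmin / 60.0)
    solid_angle_sr = math.pi * theta_rad ** 2 / 4.0  # small-angle disc
    return 2.0 * K_B * delta_Tb_K * solid_angle_sr / wavelength_m ** 2 / JY

def integration_time_hr(flux_jy, t_sys_K, k_a, bandwidth_hz, snr=5.0, n_pol=2):
    """Hours needed to reach the requested S/N on an unresolved source.

    Assumes sigma_S = T_sys / (K_a sqrt(n_pol * bandwidth * t)).
    """
    sigma_needed_jy = flux_jy / snr
    t_s = (t_sys_K / (k_a * sigma_needed_jy)) ** 2 / (n_pol * bandwidth_hz)
    return t_s / 3600.0

s_jy = flux_density_Jy(0.010, 25.0, 8.5)
print(f"flux density ~ {1e6 * s_jy:.0f} microJy")
print(f"integration time ~ {integration_time_hr(s_jy, 290.0, 10.0, 1.0e6):.0f} hr")
```

Under these assumptions the results land within about ten percent of the quoted values; different conventions for the patch geometry or detection threshold shift them at that level.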

There is often confusion about the sensitivity of radio telescopes to low surface
brightness features, especially when the telescope is a sparse "array" of small
apertures. Such a configuration is called an "unfilled aperture." One approach
to computing this important quantity uses the relation between brightness
temperature and flux density from §2.1. We obtain an equivalent brightness
temperature uncertainty

$$\Delta T^N = \frac{\sigma_S\, \lambda^2}{2 k_B\, \Omega_B}, \qquad (138)$$

where the telescope beam subtends a solid angle $\Omega_B \sim \theta_D^2$ if the telescope
collecting area is distributed over an area of diameter $D_{\max}$. Combined with the
definitions for $\sigma_S$, $K_a$ and $\theta_D$, equation (138) reduces to

$$\Delta T^N = \left(\frac{D_{\max}^2}{A_{\rm tot}}\right) \frac{T_{\rm sys}}{\sqrt{\Delta\nu\, t_{\rm int}}} = \frac{T_{\rm sys}}{\eta_f \sqrt{\Delta\nu\, t_{\rm int}}}, \qquad (139)$$

where $A_{\rm tot}$ is the total effective area and $\eta_f = A_{\rm tot}/D_{\max}^2$ is the array filling
factor. 44 An appreciation of this dependence on $\eta_f$ is crucial: the integration
time required to detect a given surface brightness grows as $t_{\rm int} \propto D_{\max}^4$ if
the (fixed) total collecting area is spread over larger areas in order to achieve
better angular resolution.
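
The penalty for diluting a fixed collecting area can be illustrated with a short Python sketch of equation (139). The array parameters used here ($10^5$ m$^2$ spread over 1 or 2 km, $T_{\rm sys} = 300$ K) are illustrative assumptions rather than the specifications of any experiment, and the function names are hypothetical.

```python
import math

def filling_factor(a_tot_m2, d_max_m):
    """Array filling factor: total effective area over D_max squared."""
    return a_tot_m2 / d_max_m ** 2

def brightness_noise_K(t_sys_K, a_tot_m2, d_max_m, bandwidth_hz, t_int_s):
    """Brightness temperature noise (K) for an unfilled aperture."""
    eta_f = filling_factor(a_tot_m2, d_max_m)
    return t_sys_K / (eta_f * math.sqrt(bandwidth_hz * t_int_s))

def integration_time_s(target_K, t_sys_K, a_tot_m2, d_max_m, bandwidth_hz):
    """Integration time (s) needed to reach a target noise level."""
    eta_f = filling_factor(a_tot_m2, d_max_m)
    return (t_sys_K / (eta_f * target_K)) ** 2 / bandwidth_hz

# Same collecting area, two array extents (illustrative values).
for d_max in (1000.0, 2000.0):
    noise = brightness_noise_K(300.0, 1.0e5, d_max, 1.0e6, 1000.0 * 3600.0)
    hours = integration_time_s(0.005, 300.0, 1.0e5, d_max, 1.0e6) / 3600.0
    print(f"D_max = {d_max:.0f} m: noise = {1e3 * noise:.1f} mK in 1000 hr, "
          f"{hours:.0f} hr to reach 5 mK")
```

Doubling the array extent at fixed area improves the resolution by a factor of two but multiplies the required integration time by sixteen.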

We can develop more intuition about the telescope response through a thought
experiment in which a radio telescope is encased in a blackbody of temperature
$T$. Regardless of its size, and with proper impedance matching, the telescope
would produce an antenna temperature $T_a = T$ at its output. For this reason,
attempts to observe the global 21 cm background are more concerned with
issues of matching and gain calibration than with antenna size (see §3.6).

On the other hand, a telescope constructed with a beam of solid angle $\Omega_B$ will
still deliver $T_a = T$ at its output if (i) it is embedded in a black body radiation
field or (ii) an emitter of $T_B = T$ entirely fills its beam. Unfortunately, real
radio telescopes do not form perfectly defined beams, and all suffer from sidelobes
whose shapes and responses are dictated by diffraction and scattering of
the incident radiation through the telescope. This is especially true of arrays,
where a fraction $(1 - \eta_f)$ of the total response lies outside the beam defined
by $\theta_D \sim \lambda/D_{\max}$.

Using equation (136) with $T_{\rm sys} \approx T_{\rm sky}$ to estimate the telescope noise $\Delta T^N$ for
a single-dish measurement of an unresolved source, we find

$$\Delta T^N|_{\rm sd} \sim 0.6\, \epsilon_{\rm ap}^{-1}~{\rm mK} \left(\frac{1+z}{10}\right)^{2.6} \left(\frac{\rm MHz}{\Delta\nu}\, \frac{100~{\rm hr}}{t_{\rm int}}\right)^{1/2}, \qquad (140)$$

where we have replaced $\eta_f$ with the aperture efficiency $\epsilon_{\rm ap}$, the ratio of effective
and physical areas for a single dish. Recall that the mean 21 cm signal
has $\delta T_b \sim 10$ mK; thus single dish telescopes can easily reach the sensitivity
necessary to detect the global 21 cm background. Instead, the challenge is
to separate the (relatively slowly varying) cosmological signal from the foregrounds
through careful gain calibration (see §3.6). This may be possible if
the 21 cm background has strong features (such as an absorption dip from
early Wouthuysen-Field coupling or a sharp break from rapid reionization).

44 Note that, because it is the ratio of the effective area to the physical size, $\eta_f$ need
not be unity even for a single dish.
Of course, given the limited resolution of single-dish experiments at these
low frequencies, interferometry is required to make maps with even relatively
coarse resolution; for realistic collecting areas, the array dilution factor $\eta_f$
dramatically decreases the sensitivity. Again using equation (136) for the system
temperature, we find

$$\Delta T^N|_{\rm int} \sim 2~{\rm mK} \left(\frac{A_{\rm tot}}{10^5~{\rm m}^2}\right)^{-1} \left(\frac{10'}{\Delta\theta}\right)^2 \left(\frac{1+z}{10}\right)^{4.6} \left(\frac{\rm MHz}{\Delta\nu}\, \frac{100~{\rm hr}}{t_{\rm int}}\right)^{1/2}. \qquad (141)$$

From equations (3) and (4), these angular and frequency scales correspond
to ~ $20\,h^{-1}$ Mpc: clearly, at least for the first generation of interferometers,
imaging will only be possible on coarse scales that exceed the typical sizes
of bubbles during most of reionization. It is for this reason that near-term
imaging experiments focus primarily on large quasar HII regions.

To this point, we have used the conventional radio astronomy approach and 
explicitly separated the angular and frequency dimensions. In most applica- 
tions, these are physically different: the former correspond to physical dis- 
tances while the latter provides spectral information from each source. Our 
application is different in that all three are equivalent: because of the cos- 
mological redshift, the frequency dimension also corresponds to a distance 
(albeit in redshift space). Thus it is usually better to think in terms of three- 
dimensional volumetric cells rather than pixels on the sky, or (for the power 
spectrum) the Fourier transform of this representation. 

In detail, the effective collecting area at a given angular resolution actually 
depends on the distribution of baselines in the array, so careful attention must 
be paid to antenna placement during array design. Figure 26 illustrates this 
for some realistic array configurations as specified in Table 6. It shows the 
fraction of Fourier-space "pixels" (each corresponding to a particular baseline 
and frequency pair; see below for a more precise definition) for which the rms 
signal exceeds the rms noise in a 1000 hour observation at z = 8 with the MWA 
(dashed curve), LOFAR (dot-dashed curve), and the SKA (solid curve); note 
that these pixels can be identified a priori because they correspond to base- 
lines with long effective integration times (and hence low noise). The vertical 
dotted line shows the scale corresponding to a 6 MHz bandwidth observation;
as we will see in §9.3, larger scales are essentially removed by foreground
cleaning. Thus only wavenumbers to the right of this line are really relevant
to observations.

Fig. 26. Fraction of three-dimensional Fourier-space "pixels" with signal-to-noise
greater than unity in several upcoming 21 cm experiments: the MWA (dashed),
LOFAR (dash-dotted), and the SKA (solid). Left: Scale-dependent signal originating
from pure density fluctuations. Right: Scale-independent 12 mK signal (half the
contrast between fully neutral and fully ionized gas). We have assumed a 1000
hour integration at z = 8. In each panel, the vertical dotted line marks the scale
corresponding to 6 MHz; smaller-k modes will likely be overwhelmed by residual
foregrounds. From [278].



The two panels make different estimates for the signal. In the left plot, we 
let it equal the (scale-dependent) rms brightness temperature fluctuation if 
only density fluctuations contribute. Because reionization tends to enhance 
the fluctuations, this is somewhat pessimistic. In the other plot, we let the 
signal equal 12 mK regardless of scale: this is one-half of the contrast between 
completely neutral and completely ionized gas. This plot therefore represents 
the ability to image isolated HII regions at each scale, essentially providing a 
"best case" scenario for imaging during reionization. Clearly, the prospects for 
imaging with LOFAR or the MWA are dim at best: only a small fraction of 
modes will be well-resolved. LOFAR fares considerably better because of its 
larger collecting area [384] - and is even reasonably efficient at detecting HII 
regions with k < 0.1 Mpc$^{-1}$ - but note that it also has a significantly smaller
field of view. Thus the total number of imaged modes per field is actually
comparable in the two experiments. The SKA, on the other hand, has a large
enough collecting area to image all modes up to k = 0.3-0.5 Mpc$^{-1}$, with a
tail extending to higher wavenumbers. Still, these scales correspond to radii
of ~ 5 Mpc, so detecting small HII regions will always be difficult.

9.1.3 Telescope Response Patterns

We now briefly discuss antenna response patterns in order to prepare for our 
later assessment of systematic errors and astrophysical foregrounds. 

The left panel of Figure 27 sketches the basic antenna station that the MWA 
and LOFAR designs use as a fundamental element of the interferometer arrays. 
This "tile" is composed of 16 dipoles positioned in a 4x4 grid on a conducting 
ground plane. The dipoles shown in Figure 27 are aligned parallel to the ground 
plane. For optimum gain near zenith at observed wavelength $\lambda$, they should
be positioned on a grid with $\lambda/2$ spacing at $\lambda/4$ above the ground plane.

The effective collecting area of a dipole scales as $\lambda^2$ in applications where the
linear size of the dipole is adjusted in proportion to wavelength. However, in 
the tile design, a fixed size for the elements and fixed spacing must serve a 
range of wavelengths in order to observe a range of redshifts. Under these cir- 
cumstances, the elements are operated as "active short dipoles" [385] in which 
preamplifiers are built into the hubs of the dipoles to assist in impedance 
matching, and the shortness of the dipoles causes their effective collecting 
areas to drop toward long wavelengths. Despite this apparent loss of effectiveness,
the noise power from the sky brightness continues to dominate $T_{\rm sys}$,
and some benefits accompany the reduced effective collecting area toward long 
wavelengths, since it carries with it a weaker coupling of the fields between 
the elements and a reduced cross section for mutual shadowing. 

The prototype designs have a second set of orthogonal, north-south dipoles 
co-located with the east-west ones in order to receive both polarizations si- 
multaneously. The optimized designs from extensive computer modeling are 
generally complex structures that look little like the conventional picture of 
the linear dipoles sketched in Figure 27. 

The right panel of Figure 27 shows the simulated response pattern of a simple
4x4 tile, assuming that it is phased to receive radiation from azimuth angle 30°
and zenith angle 30° (gray scale image with contours). 45 This response pattern
is known as the primary beam of the antenna and will be denoted $W_\nu(\hat{n}_0, \hat{n})$.
Here $\hat{n}$ is a unit vector denoting an arbitrary direction in the sky, and $\hat{n}_0$
is the direction for which the tile is phased to produce maximum response.
Note the presence of "sidelobes" far from the nominal pointing direction; these
pose a significant problem for experiments. The sine projection onto the plane
tangent to the celestial sphere at zenith has the useful property that nulls
between the main beam and sidelobes retain a rectangular grid pattern. The
depth of the nulls in real telescopes is unlikely to equal that indicated in this
figure, because slight variations in gain among the dipoles and quantization
in the delays made to steer the tile beam imply that adding the 16 signals to
form the tile beam will probably introduce errors greater than the $10^{-4}$ level
depicted in Figure 27.

45 Although dipole tiles lie flat on the ground and are not physically steerable,
properly phasing the elements allows us to point the combined beam by introducing
electrical delays in the signal paths from the dipoles.

Fig. 27. "4x4 tile" and its response pattern. Left: Sketch of a "tile" with 16
horizontal dipoles arrayed in a 4x4 grid, spaced at $d \sim \lambda/2$. For optimum gain at zenith,
the dipoles must be positioned at $\lambda/4$ above the conducting ground plane. Right:
Response of the tile, projected on the sky, for phasing that points toward an azimuth
angle Az=30° and zenith angle ZA=30°. Radial distance is $\propto \sin({\rm ZA})$. Contours
lie at -3 (dashed), -10, -20, -30, and -40 decibels relative to maximum.
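The phased response of a tile can be illustrated with an idealized array factor. The sketch below is a simplification and not the simulation behind Figure 27: it assumes 16 isotropic elements on a 4x4 grid with half-wavelength spacing, ignores the dipole element pattern, the ground plane, and mutual coupling, and assumes azimuth is measured from north through east. The function names `direction` and `tile_array_factor` are hypothetical.

```python
import cmath
import math

def direction(az_deg, za_deg):
    """Unit vector (east, north, up); azimuth from north through east (assumed convention)."""
    az, za = math.radians(az_deg), math.radians(za_deg)
    return (math.sin(za) * math.sin(az), math.sin(za) * math.cos(az), math.cos(za))

def tile_array_factor(n, n0, n_side=4, spacing_wavelengths=0.5):
    """Normalized power response of an n_side x n_side grid phased toward n0.

    Element positions are in units of the wavelength, so changing the observing
    frequency at fixed physical spacing changes spacing_wavelengths.
    """
    total = 0j
    for ix in range(n_side):
        for iy in range(n_side):
            x = ix * spacing_wavelengths
            y = iy * spacing_wavelengths
            # Geometric phase relative to the phased (pointing) direction.
            phase = 2.0 * math.pi * (x * (n[0] - n0[0]) + y * (n[1] - n0[1]))
            total += cmath.exp(1j * phase)
    return abs(total) ** 2 / n_side ** 4

n0 = direction(30.0, 30.0)          # pointing used in Figure 27
peak = tile_array_factor(n0, n0)    # unity at the phased direction
```

Because the positions enter only in units of the wavelength, evaluating the same direction with a different `spacing_wavelengths` shows how the nulls shift with frequency.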

An important characteristic of the tile design (true, in fact, for all diffraction-limited
telescopes) is that the sidelobe pattern scales with $\lambda$. Thus the nulls
systematically move across the sky as a function of frequency, even if the center 
of the beam remains phased to point at specific coordinates independently of 
frequency. This chromatic behavior is a serious concern in the discussion of 
astrophysical foregrounds, because the interferometer array detects sources all 
across the sky through the frequency-dependent sidelobes. Thus as the earth 
turns and the maximum of the response pattern tracks the chosen celestial 
coordinates, the reception pattern projected on the sky rotates and distorts, 
causing the signals received from sources lying in the sidelobes to vary in 
strength. 



9.1.4 Polarization Properties of Crossed Dipole Antennae 

These 21 cm line observations will require extraordinarily precise instrumen- 
tal calibration because of the high dynamic range necessary to identify mK 
structures in the bright radio sky (see eq. 136), with the additional compli- 
cation of bright discrete radio sources. While they are relatively inexpensive, 
dipole antennae require unconventional calibration methods. As we will see, a 
particular problem is instrumental coupling between polarized flux and total 
intensity.

Fig. 28. Schematic view of crossed dipole antennae above a ground plane for four
viewing angles, varying from face-on at zenith angle ZA = 0° to ZA = 90°.



The calibration problem is made difficult by the varying polarization response 
of crossed dipoles with angle on the sky, as illustrated by Figure 28. Dipoles 
that receive orthogonal polarizations from sources near zenith become progres- 
sively more correlated in their reception patterns as the source's zenith angle 
increases. For sources close to the horizon, the crossed dipole antenna can re- 
ceive only a single polarization; fortunately, this limit is not a severe problem, 
because dipoles above ground planes project nulls toward the horizon. 

Although antenna tiles will not be used to track sources at zenith angles 
greater than 60°, their large sidelobes (Fig. 27) imply that flux from bright 
sources and Galactic continuum emission enter the telescope from nearly the 
entire hemisphere above the horizon. For applications such as ours where pre- 
cise intensity measurements or polarization information is desired, the an- 
tenna response must be carefully calibrated to decode the incident signals 
into Stokes parameters. The calibration of antenna tiles will take advantage of 
the well-developed "measurement equation" formalism [386-388] to describe 
this instrumental response. A single dipole above a ground plane has a readily 
calculable response with this formalism. But an array of crossed dipoles on 
a tile is complicated by shading and electrical coupling between the dipoles, 
and these effects are more difficult to quantify with the measurement equa- 
tion. Developmental studies are now in progress to implement the necessary 
degree of precision into instrumental calibration.

Fig. 29. Telescope response patterns. Upper row: The sine projection of the
response pattern on the plane tangent to the sky at zenith. Lower row: The response
pattern projected on the celestial hemisphere above the horizon. Left: Response
of a single interferometer element (a "tile" in the case of MWA), also known as
the primary beam. Center: Response of a two element, east-west interferometer of
omni-directional elements with a baseline of $20\lambda$. Right: Response of an
interferometer with two MWA tiles as receiving elements.

9.1.5 Interferometer Response Patterns 

When two antennae are coupled together electronically to form an interferometer,
the combined response projected on the sky resembles the characteristic
diffraction pattern from a double slit (see Fig. 29). In general, the interferometer
response to the sky brightness distribution $I_\nu(\hat{n})$ is

$$\int d\hat{n}\, I_\nu(\hat{n})\, E_1(\hat{n}_0, \hat{n}, \nu)\, E_2^*(\hat{n}_0, \hat{n}, \nu)\, e^{2\pi i\, \hat{n} \cdot \mathbf{B}/\lambda}, \qquad (142)$$

where $E_i(\hat{n}_0, \hat{n}, \nu)$ is the complex electric field response pattern of the $i$th
element. The term $\exp(2\pi i\, \hat{n} \cdot \mathbf{B}/\lambda)$, where $\mathbf{B}$ is the baseline vector from one
interferometer element to the other, is purely geometric. With the assumption
of identical beam patterns for all the elements, the product of the electric field
patterns becomes the primary beam power pattern $W_\nu(\hat{n}_0, \hat{n})$ introduced in
the previous section. Radio astronomers conventionally write the response for a
particular "visibility" $V$, corresponding to a particular baseline and frequency
pair, in units of flux density

$$V_{\rm Jy}(\mathbf{B}, \hat{n}_0, \nu) = \int d\Omega\, \delta T_b(\hat{n}, \nu)\, W_\nu(\hat{n}_0, \hat{n})\, e^{2\pi i\, \hat{n} \cdot \mathbf{B}/\lambda}. \qquad (143)$$

Usually, an approximate form of equation (143) can be adopted:

$$V_{\rm Jy}(\hat{n}_0, u, v, \nu) \approx \int dx\, dy\, \delta T_b(x, y, \nu)\, W_\nu(\hat{n}_0, \hat{n})\, e^{2\pi i (ux + vy)}, \qquad (144)$$

where $\mathbf{B} = \lambda(u\hat{\imath}, v\hat{\jmath}, w\hat{n}_0)$. In the orthogonal $(u, v, w)$ coordinate system,
the $w$ axis aligns with the direction toward the sky at the center of the tile
beam, and the u-v axes are oriented so that the $v$ axis projects onto the local
meridian. The coordinates $x$ and $y$ are angles measured in the "sky plane"
relative to the intersection of $\hat{n}_0$ with the celestial sphere. In this Fourier
transform of the sky, $u$ and $v$ represent spatial frequencies and the $w$ axis
produces a phase offset in the interferometer fringe that can be calibrated. This
is the standard equation for describing aperture synthesis techniques, because
radio interferometer arrays are designed to sample $V_{\rm Jy}(u, v, \nu)$ at many u-v
coordinates and then reconstruct images of the sky through the inverse Fourier
integral (e.g., [381]). For our purposes, it is convenient to define the equivalent
response in temperature units,

$$V(\hat{n}_0, u, v, \nu) = \lambda^2\, V_{\rm Jy}(\hat{n}_0, u, v, \nu)/2k_B. \qquad (145)$$
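As a rough illustration of equations (144) and (145), the sketch below evaluates the flat-sky visibility for a handful of unresolved sources, replacing the integral by a discrete sum, and then converts from flux density to temperature units. The discretization, the source tuples, and the function names are assumptions made for this example only.

```python
import cmath
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K
JY = 1.0e-26         # 1 Jy in W m^-2 Hz^-1

def flat_sky_visibility(sources, u, v):
    """Sum S * W * exp(2 pi i (u x + v y)) over point sources.

    sources: iterable of (flux_jy, beam_weight, x_rad, y_rad), where x and y are
    small angles in the sky plane and beam_weight is the primary beam at (x, y).
    """
    return sum(s * w * cmath.exp(2j * math.pi * (u * x + v * y))
               for s, w, x, y in sources)

def jy_to_temperature_units(v_jy, wavelength_m):
    """Apply V = lambda^2 V_Jy / (2 k_B), with V_Jy given in Jy."""
    return wavelength_m ** 2 * v_jy * JY / (2.0 * K_B)

# One source at the phase center: the visibility equals its flux on every baseline.
v_center = flat_sky_visibility([(2.0, 1.0, 0.0, 0.0)], u=10.0, v=5.0)

# Two equal sources half a fringe apart cancel on this baseline.
v_pair = flat_sky_visibility([(1.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.05, 0.0)], u=10.0, v=0.0)

# A 1 Jy visibility at a wavelength of 2 m (150 MHz) in temperature units.
v_kelvin = jy_to_temperature_units(1.0, 2.0)
```

The cancelling pair shows why each baseline responds to a single spatial frequency in the small-field limit: sources separated by half a fringe contribute with opposite sign.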

For most of the following discussion, we adopt the assumptions and nomen- 
clature of equation (144). However, its approximations are not always valid, 
in which case more elaborate solutions than a simple Fourier inversion are 
required. Complications arise because we must observe large areas of sky to 
build sufficient statistics and to identify interesting objects for followup at 
other wavelengths. Thus in order to keep the primary beam patterns large, 
the individual interferometer elements are intentionally kept small. There are 
several consequences. 

First, the assumption of a single spatial frequency ($= \sqrt{u^2 + v^2}$) across the
entire primary beam does not hold for beams as broad as our example tile's
(Fig. 27). Figure 29 shows that the fringe spacing (or "resolution" $\theta$) of the
interferometer varies with direction according to $\theta \propto \lambda/B \sin\alpha$, where $\alpha$ is
the angle between $\mathbf{B}$ and $(\hat{n} - \hat{n}_0)$. Thus the spatial frequency of structures
in the sky sensed by the interferometer depends on their directions on the 
sky, and the interferometer actually senses a band of spatial frequencies rather 
than a single one. More sophisticated imaging techniques will be required than 
those commonly used in aperture synthesis applications. Extracting the power 
spectrum will of necessity be a multi-stage procedure, involving an interplay 
between image plane and u-v constraints during calibration and foreground 
cleaning. 

Second, for telescope designs relying on electronically steered tiles, slight differences
in primary beam patterns $E_i(\hat{n}_0, \hat{n}, \nu)$ will result from small imbalances
in the gains and the delays used to sum the signals from the individual ele- 
ments. Especially with regard to removing sources outside the primary beam, 
these gain patterns will be difficult to calibrate, because every pointing direc- 
tion suffers from slightly different gain imperfections. Thus the gain variations 
from element to element void the assumptions of the simple Fourier inver- 
sion approach. In the course of a day, the rotation of the reception pattern, 
with its lobes and nulls, also defeats the assumptions adopted in conventional 
Earth-rotation aperture synthesis.

Third, the Fourier integral does not properly account for sources far outside the
primary beam; in effect, these add a noise-like contribution entering through 
the sidelobes. Were it not for the variability of the primary reception pattern, 
the signals from these outlying sources would be suppressed by the system- 
atic cancellation of the aperture synthesis method. But as it is, the residuals 
can remain problematic because of the extreme brightness of the foregrounds 
relative to the redshifted 21 cm signal. 

The problem is that a source with a flat continuum couples to the field of
interest in a frequency-dependent way. Each baseline in the array (phased to
point in the direction $\hat{n}_0$) senses an unresolved source $S$ in the direction $\hat{n}$
with response

$$V_{\rm Jy}(\mathbf{B}, \hat{n}_0, \nu) = W_\nu(\hat{n}_0, \hat{n})\, S \exp\left[2\pi i\, (\hat{n} - \hat{n}_0) \cdot \mathbf{B}\, \nu/c\right] = W_\nu(\hat{n}_0, \hat{n})\, S \exp(2\pi i\, \nu/\nu_c), \qquad (146)$$

where $\nu_c$ is the characteristic frequency scale of the chromatic contamination;
for a field observed on the meridian by an east-west interferometer, a source
rising in the east would yield $\nu_c = c/|\mathbf{B}|$, which for a 1 km baseline inserts
a ripple into the spectrum with period 0.3 MHz. As the source rises higher,
the period of the ripple in the spectrum lengthens. The superposition of sig- 
nals, with pseudo-random phases, from all over the sky, creates a noise-like 
background that is unlikely to be gaussian if it is dominated by several bright 
sources. Because methods for foreground suppression rely on spectral smooth- 
ness (see §9.3), this chromatic behavior will escape the conventional foreground 
removal. For this reason, reduction of 21 cm observations will need to begin 
with all-sky catalogs of the bright sources and perform iterative calibration 
and source subtraction. 
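The ripple period quoted above follows directly from $\nu_c = c/|(\hat{n} - \hat{n}_0) \cdot \mathbf{B}|$. The sketch below evaluates it for a 1 km baseline; the `projection` argument, defined here as $|(\hat{n} - \hat{n}_0) \cdot \mathbf{B}|/|\mathbf{B}|$, is a convenience introduced for this example and does not appear in the text.

```python
C_M_PER_S = 299_792_458.0  # speed of light

def ripple_period_hz(baseline_m, projection=1.0):
    """Spectral ripple period nu_c = c / (|B| * projection).

    projection = |(n - n0) . B| / |B|; it equals 1 for a source on the horizon
    along the baseline while the array is phased toward the meridian.
    """
    if projection <= 0.0:
        raise ValueError("projection must be positive")
    return C_M_PER_S / (baseline_m * projection)

horizon_mhz = ripple_period_hz(1000.0) / 1e6        # about 0.3 MHz
higher_mhz = ripple_period_hz(1000.0, 0.5) / 1e6    # smaller projection, longer period
```

A smaller projection, as for a source higher in the sky, lengthens the period, which matches the behavior described in the text.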

The strength of this "sideground" contamination can be estimated by inte- 
grating over the differential extragalactic source counts [389] , 

n(S) dS = IO^-J^ghz d^Jy arcmin" 2 , (147) 

where $S_{\mu{\rm Jy}}$ is the flux density in microJanskys and $\nu_{\rm GHz}$ is the frequency in
GHz. For a simple estimate, assume that $4\pi f_{\rm sky}$ steradians of sky (with $f_{\rm sky} = 1/8$)
is viewed through -20 dB sidelobes of the tile pattern (corresponding
to a gain $g_{\rm sl} = 0.01$). The vector responses of the sources can be combined
as a random walk, $\sigma_{\rm sg} = g_{\rm sl}\, [f_{\rm sky}\, 4\pi \int n(S)\, S^2\, dS]^{1/2}$. We consider $z = 8.5$
($\nu = 150$ MHz) and ignore the phase smearing caused by rotation (i.e., we
integrate over a short interval of time). Since the integral diverges mildly at 
the bright end, we truncate it at 10 Jy, which implicitly assumes that all ~800 
sources of 10 Jy and brighter have been dealt with individually through self- 
calibration and subtraction. The cumulative effect of the remaining weaker
sources on a single baseline is $\sigma_{\rm sg} \approx 1.6$ Jy. An array such as the MWA has
$N_a = 500$ tiles, yielding $N_B = N_a(N_a - 1)/2 = 1.2 \times 10^5$ baselines. If this
sideground noise from each baseline were to add incoherently for all baselines,
then the rms fluctuation would fall to $\sim 1.6\ {\rm Jy}/\sqrt{N_B} \approx 4$ mJy. However, the
baselines of an interferometer are of course calibrated to preserve systematic
phases in a coherent sum, so the reduction in sideground fluctuations will
actually be much larger. Moreover, the Earth's rotation also smears the signals
from outlying sources, and the total suppression should bring the residual
sideground contamination below the tens of $\mu$Jy signal expected from the
$z \sim 10$ Universe. Similar estimates at lower frequencies show that observing
the dark ages themselves will be extremely difficult. However, this is an image 
processing regime that has not yet been fully explored and conquered, so for 
the moment these expectations should be viewed with some caution. 
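The baseline count and the incoherent-averaging step in this estimate are simple to check numerically. The sketch below takes the per-baseline value of 1.6 Jy from the text as an input rather than deriving it from the source counts; the function names are illustrative.

```python
import math

def n_baselines(n_antennas):
    """Number of distinct antenna pairs, N_a (N_a - 1) / 2."""
    return n_antennas * (n_antennas - 1) // 2

def incoherent_rms(per_baseline_rms, n_antennas):
    """rms after averaging independent baselines: sigma / sqrt(N_B)."""
    return per_baseline_rms / math.sqrt(n_baselines(n_antennas))

n_b = n_baselines(500)                # about 1.2e5 baselines
rms_jy = incoherent_rms(1.6, 500)     # about 4.5e-3 Jy
```

The result, about 4.5 mJy, is consistent with the rounded value quoted above and remains well above a signal of tens of microJanskys, which is why the additional coherent suppression matters.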

9.2 Sensitivity: Statistical Measures 

Given the difficulty of high signal-to-noise imaging, attention has recently fo- 
cused on statistical measurements. We will now turn to estimating the sensitiv- 
ity of 21 cm experiments to the power spectrum. Error estimates for other sta- 
tistical measures must still be developed, but the basic principles are the same 
(see [390,391]). In this section, we will only consider the effects of thermal noise 
and cosmic variance, which provide a fundamental limit. Systematics (espe- 
cially foregrounds) present equally large difficulties, and the community is hard 
at work developing strategies to mitigate them (see §9.3). Over the past sev- 
eral years, the CMB community has developed expertise at both imaging and 
power spectrum estimation with interferometers, which provides an important 
launching point for describing the 21 cm signal (e.g. [392]). The theory has now 
been further developed and applied directly to the 21 cm case (different from 
the CMB primarily because it is three-dimensional) [158,278,379,393-399]. 
Note that we will work exclusively in terms of the three-dimensional power 
spectrum. Similar considerations apply to the angular power spectrum [158]. 

We begin with the complex visibility of equation (144). The detector noise for
a single visibility measurement is closely related to equation (135) [382]. In
the limit in which $T_{\rm sys} \approx T_{\rm sky}$,

(148)

where $t_u$ is the integration time of this particular baseline 46 and
$A_e = \epsilon_{\rm ap}\, \delta A$ is the effective collecting area of each antenna element, which is equal
to its physical area $\delta A$ multiplied by the aperture efficiency $\epsilon_{\rm ap}$. For simplicity,
we will set $\epsilon_{\rm ap} = 1$ in the following; its propagation through the error estimate
depends on the particulars of each experiment (especially whether the beam
is tapered). To mark the difference, we will use $\delta A$ exclusively below. This
expression follows naturally from equation (137) for the noise level measured
in flux density, combined with the definition for $K_a$, and the conversion from
flux density to temperature in equation (145).

46 Typically, with these large interferometers, the uv coverage changes with the
Earth's rotation, so this is not the same as the total integration time. Note that

The observed "visibility data cube" is actually a hybrid of Fourier space $(u, v)$
and redshift-space $(\nu)$ coordinates and is thus inconvenient for comparing
to theoretical models. One can either transform the visibility data to the
sky plane to obtain the "image cube" or transform the frequency (redshift)
coordinate to its Fourier-space equivalent in order to obtain a representation
with spatial frequency for all three dimensions,

$$\delta T_b(\mathbf{u}) = \int_B d\nu\, V(u, v, \nu)\, e^{2\pi i\, \nu\eta}, \qquad (149)$$

where the integration extends over the full bandwidth $B$ of the observation,
$\mathbf{u} = u\hat{\imath} + v\hat{\jmath} + \eta\hat{z}$, and $\eta$ has dimensions of time. In this representation, the
effective noise can be obtained by Fourier transforming the signal across the
frequency axis [398], yielding

$$\Delta T^N(\mathbf{u}) = \frac{\lambda^2 T_{\rm sys}}{\delta A} \sqrt{\frac{B}{t_u}} = \frac{T_{\rm sys}}{\sqrt{B\, t_u}}\, \frac{\lambda^2}{\delta A\, \delta\eta}. \qquad (150)$$

In the second equality, we have set $\delta\eta$ equal to the inverse bandwidth. The
factor $\delta A/\lambda^2 \times \delta\eta$ then represents the Fourier space resolution of the observation
(or the inverse volume sampled by the primary beam, in the appropriate
units); note the similarity to equation (135) when written in this form. Here
$\Delta T^N(\mathbf{u})$ has units of temperature divided by time, because of the Fourier
transform in the frequency direction.

To estimate the statistical errors, we need the covariance matrix of the noise
for antenna pairs at baselines $\mathbf{u}_i$ and $\mathbf{u}_j$. Because the thermal noise errors
are uncorrelated between measurements, this is simply a diagonal matrix with
each element the square of equation (150). In transforming to the physical
wavevector $\mathbf{k}$, we distinguish between the component $\mathbf{u}_\perp$ oriented along the
sky (corresponding to $\mathbf{k}_\perp = 2\pi \mathbf{u}_\perp/\ell$, where $\ell$ is the comoving distance to the
observed 21 cm screen) and the component $k_\parallel$ along the line of sight. This is
useful because interferometers can have arbitrarily good frequency resolution
while the $\mathbf{u}_\perp$ coverage is fixed by the baseline distribution.

this is qualitatively different from CMB experiments, which work at high enough
frequencies that the instruments can be rotated mechanically to achieve the desired
uv coverage.



We define the number density of baselines that observe a given $\mathbf{u}_\perp$ as $n(\mathbf{u}_\perp)$;
this is normalized so that its integral over the half-plane is $N_B = N_a(N_a - 1)/2$,
the total number of baselines in the array. Two properties of $n(\mathbf{u}_\perp)$ are worth
emphasizing. First, because of the Earth's rotation, it is azimuthally symmetric
and only a function of $u_\perp = |\mathbf{u}_\perp|$. Second, for a smooth antenna distribution,
$n(u_\perp)$ is virtually always a decreasing function of $u_\perp$. This is simple geometry:
it is difficult to arrange the antenna distribution to have many more long
baselines than short ones. In practice, a ring of antennae provides the flattest
possible distribution (though that is rarely optimal; see below). We can write:

$$t_k \approx \frac{\delta A}{\lambda^2}\, n(u_\perp)\, t_{\rm int}. \qquad (151)$$

As before, $\delta A/\lambda^2 \approx \delta u\, \delta v$ is the angular component of the Fourier-space
resolution. Thus [278,398]

$$C^N(\mathbf{k}_i, \mathbf{k}_j) = \langle \Delta T^N(\mathbf{u}_i)^*\, \Delta T^N(\mathbf{u}_j) \rangle = \left( \frac{\lambda^2 B\, T_{\rm sys}}{\delta A} \right)^2 \frac{\delta_{ij}}{B\, t_k}. \qquad (152)$$
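A minimal numerical sketch of the thermal noise term follows, assuming equations (151) and (152) read $t_k \approx (\delta A/\lambda^2)\, n(u_\perp)\, t_{\rm int}$ and $C^N = (\lambda^2 B\, T_{\rm sys}/\delta A)^2/(B\, t_k)$ on the diagonal. The numerical inputs are placeholders chosen for illustration, not the parameters of any real array.

```python
def time_per_mode(area_m2, wavelength_m, n_uperp, t_int_s):
    """Observing time per Fourier cell: (dA / lambda^2) * n(u_perp) * t_int."""
    return (area_m2 / wavelength_m ** 2) * n_uperp * t_int_s

def thermal_noise_covariance(area_m2, wavelength_m, bandwidth_hz, t_sys_k, t_k_s):
    """Diagonal thermal-noise covariance, (lambda^2 B T_sys / dA)^2 / (B t_k)."""
    amplitude = wavelength_m ** 2 * bandwidth_hz * t_sys_k / area_m2
    return amplitude ** 2 / (bandwidth_hz * t_k_s)

# Placeholder inputs: 16 m^2 elements, 2 m wavelength, 6 MHz band, 1000 hours.
t_k = time_per_mode(area_m2=16.0, wavelength_m=2.0, n_uperp=0.1, t_int_s=3.6e6)
c_n = thermal_noise_covariance(16.0, 2.0, 6.0e6, 440.0, t_k)
```

In this form the covariance scales as $T_{\rm sys}^2$ and inversely with the time spent on each mode, so doubling the baseline density at a given $u_\perp$ halves the noise power there.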



Equation (152) represents the thermal noise contribution to the covariance
matrix; even in an ideal experiment with no systematics from foregrounds, we
must also include errors from sample variance. This component is [278]

$$C^{SV}(\mathbf{k}_i, \mathbf{k}_j) = \langle \delta T_b^*(\mathbf{k}_i)\, \delta T_b(\mathbf{k}_j) \rangle$$
$$\approx \delta_{ij}\, \overline{\delta T}_b^2 \int d^3u\, |W(\mathbf{u}_i - \mathbf{u})|^2\, P_{21}(\mathbf{u})$$
$$\approx \overline{\delta T}_b^2\, P_{21}(\mathbf{k}_i)\, \frac{\lambda^2 B^2}{\delta A\, \ell^2 \Delta\ell}\, \delta_{ij}, \qquad (153)$$

where $\ell$ is the distance to the 21 cm field and $\Delta\ell \propto B$ is the line-of-sight
depth of the observed volume in comoving units (see eq. 4). In the first line,
the average is over baseline and frequency pairs indexed by $\mathbf{k}_i$ and $\mathbf{k}_j$ (or
equivalently $\mathbf{u}_i$ and $\mathbf{u}_j$). In the second line, $W$ is the Fourier transform of the
primary beam response function, including the finite bandwidth, and is most
naturally expressed in the "observed" units $\mathbf{u}$. It typically differs from zero
in an area $\delta u\, \delta v\, \delta\eta \approx \delta A/(\lambda^2 B)$ and (ignoring efficiencies) integrates to unity
over the beam. For the last line, we have assumed that $\mathbf{u}$ is much larger than
the width of this response function. Then $P_{21}(\mathbf{u})$ is constant across the beam
and can be pulled out of the integral, which becomes simply $(\delta u\, \delta v\, \delta\eta)^{-1}$. We
have also transformed to the more physically relevant wavenumber $\mathbf{k}$, which
introduces a factor $B/(\ell^2 \Delta\ell)$.

Equation (153) has a simple physical interpretation: it is essentially a normalization
factor ($\overline{\delta T}_b^2 B^2$) multiplied by $P_{21}/V_t$, where $V_t \sim \ell^2 \Delta\ell\, (\lambda^2/\delta A)$ is the
total volume observed by the telescope. The second factor counts the number
of estimates available for the measurement; thus the cosmic variance decreases
as the volume increases.

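The Fisher information matrix gives an estimate of the errors on a power
spectrum measurement from the total covariance matrix $\mathbf{C} = \mathbf{C}^N + \mathbf{C}^{SV}$.
Given a vector of parameters $p_i$, the $(i, j)$ element of the Fisher matrix is
defined as (see, e.g., [400])

$$F_{ij} \equiv -\left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial p_i\, \partial p_j} \right\rangle \qquad (154)$$
$$= \frac{1}{2}\, {\rm Tr}\left[ \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial p_i}\, \mathbf{C}^{-1} \frac{\partial \mathbf{C}}{\partial p_j} \right], \qquad (155)$$

where $\mathcal{L}$ is the likelihood function. For the simple case of measuring the
binned power spectrum from the datapoints, the "parameters" are the power
spectrum amplitudes in each of the bins, $p_i = P_{\Delta T} = \overline{\delta T}_b^2\, P_{21}(\mathbf{k}_i)$; in more
general cases they are the parameters of a theoretical model meant to describe
the data. The Cramér-Rao inequality states that the errors on any unbiased
estimator of the power spectrum must satisfy

$$\delta P_{\Delta T}(\mathbf{k}_i) \geq \frac{1}{\sqrt{N_c}} \sqrt{(\mathbf{F}^{-1})_{ii}}, \qquad (156)$$

where $N_c$ is the number of measurements in the appropriate bin and $\mathbf{F}^{-1}$ is
the inverse of the Fisher matrix.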

In our case, the Fisher matrix is particularly simple to use because the covariance
matrix is diagonal. (This will not be true for real data, because foreground
cleaning and other systematic effects induce correlated residual errors,
but it provides a rough estimate of the noise limits.) Also, $\mathbf{C}^N$ is of course
independent of the underlying power spectrum, so the errors become [278]

$$\delta P_{\Delta T}(\mathbf{k}_i) \approx \frac{1}{\sqrt{N_c(k_i)}}\, \frac{\delta A\, \ell^2 \Delta\ell}{\lambda^2 B^2} \left[ C^N(\mathbf{k}_i, \mathbf{k}_i) + C^{SV}(\mathbf{k}_i, \mathbf{k}_i) \right]. \qquad (157)$$

The last step is to count the number of Fourier cells in each bin, which depends
on the Fourier-space resolution of the instrument. Recall that $P_{21}$ is not truly
isotropic, but it is azimuthally symmetric. Thus we use Fourier cells grouped
into annuli of constant $(k, \mu)$. Then, in the limit that $k\mu$ is much larger than
the effective $k_\parallel$ resolution,

$$N_c(k) \approx 2\pi k^2\, \Delta k\, \Delta\mu \times \frac{V_t}{(2\pi)^3}, \qquad (158)$$

where the last term represents the Fourier space resolution. One can also average
the measurements spherically [379,398,399]; although intuitively simpler,
this eliminates information whenever redshift-space distortions are significant.
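The cell counting and the final error estimate can be sketched as follows. This assumes the number of Fourier cells in an annulus is $2\pi k^2\, \Delta k\, \Delta\mu$ times the survey volume divided by $(2\pi)^3$, and that the error is the summed covariance divided by $\sqrt{N_c}$ up to a unit-conversion prefactor; the `prefactor` argument and the function names are placeholders for this example.

```python
import math

def n_cells(k, dk, dmu, volume):
    """Fourier cells in an annulus: 2 pi k^2 dk dmu * volume / (2 pi)^3."""
    return 2.0 * math.pi * k ** 2 * dk * dmu * volume / (2.0 * math.pi) ** 3

def power_error(c_noise, c_sample, n_c, prefactor=1.0):
    """Error on the binned power: prefactor * (C_N + C_SV) / sqrt(N_c)."""
    if n_c <= 0.0:
        raise ValueError("the bin must contain at least a fraction of a cell")
    return prefactor * (c_noise + c_sample) / math.sqrt(n_c)

# Logarithmic bins with dk = k / 2: the cell count grows as k^3.
cells_low = n_cells(k=0.1, dk=0.05, dmu=0.1, volume=1.0e9)
cells_high = n_cells(k=0.2, dk=0.10, dmu=0.1, volume=1.0e9)
```

With bins of fixed logarithmic width the number of cells grows as $k^3$, so the averaging alone improves the error as $k^{-3/2}$ before the baseline distribution is taken into account.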



9.2.1 Implications

Equations (152), (153), and (157) fully specify the effects of noise in the absence
of systematic effects (which we will discuss in the next two sections).
But to make estimates we must determine the effective observing time for
each mode - and hence the baseline distribution $n(u_\perp)$ by equation (151) - as
well as the sampling density (eq. 158 for a measurement in annuli). These two
quantities are obviously highly dependent on the instrumental configuration.
Thus it is useful to consider the simple thermal noise-dominated case in order
to develop some intuition for array design. Substituting for $N_c$ in equation
(157) and assuming $C^N \gg C^{SV}$, we find

$$\delta P_{\Delta T} \propto \delta A^{-3/2}\, B^{-1/2} \left[ \frac{1}{k^{3/2}\, n(k, \mu)} \right] \left( \frac{T_{\rm sys}^2}{t_{\rm int}} \right), \qquad (159)$$

where of course $u_\perp \propto k\sqrt{1 - \mu^2}$. Here we have assumed that the power spectrum
is measured in bins with constant logarithmic width in $k$ but constant
linear width in $\mu$. From equation (159), we can deduce a number of fundamental
considerations driving array design [398]. For instance, we see that
$\delta P_{\Delta T} \propto t_{\rm int}^{-1}$; this is because the power spectrum depends on the square of the
intensity.

A more subtle question is the improvement possible by increasing the collecting
area, which could be accomplished in either of two ways. First, we can
add antennae while holding the dish size $\delta A$ constant. Recall that $n(k, \mu)$ is
normalized to the total number of baselines $N_B \propto N_a^2$: thus, adding antennae
of a fixed size decreases the errors by the total collecting area squared. (Of
course, the number of correlations needed also increases by the same factor, so
this strategy has other costs.) Second, we can make each antenna larger but
hold their total number fixed. In this case, the total number of baselines, and
hence $n(k, \mu)$, remains constant, but $\delta P_{\Delta T} \propto \delta A^{-3/2}$. Increasing the collecting
area in this way is not as efficient because it decreases the total field of view
of the instrument.

Adding bandwidth increases the sensitivity relatively slowly: $\delta P_{\Delta T} \propto B^{-1/2}$,
because it adds new volume along the line of sight without affecting the noise
on any given measurement. Of course, one must be wary of adding too much
bandwidth because of systematics (especially foregrounds) and possibly intrinsic
evolution.
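These scalings can be collected into a single relative figure of merit. The sketch below is valid only in the thermal-noise-dominated limit discussed above; it assumes the baseline density at fixed $(k, \mu)$ scales as $N_a^2$, and all arguments are ratios relative to a reference design rather than physical values.

```python
def relative_error(n_antennas=1.0, area=1.0, bandwidth=1.0, t_int=1.0):
    """Thermal-noise-limited error relative to a reference array.

    Each argument is the ratio of the new design to the reference design.
    """
    baseline_density = n_antennas ** 2   # n(k, mu) is normalized to N_B, roughly N_a^2
    return area ** -1.5 * bandwidth ** -0.5 / (baseline_density * t_int)

more_antennas = relative_error(n_antennas=2.0)   # same total area gain of 2: 0.25
bigger_dishes = relative_error(area=2.0)         # same total area gain of 2: about 0.35
wider_band = relative_error(bandwidth=4.0)       # 0.5
```

Doubling the collecting area by adding antennae lowers the error by a factor of four, whereas doubling it through larger elements gives a factor of about 2.8, which is the comparison made in the text.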

Finally, as a function of scale $k$, $\delta P_{\Delta T} \propto k^{-3/2}\, n(k, \mu)^{-1}$. The first factor comes
from the increasing (logarithmic) volume of each annulus as $k$ increases. But in
realistic circumstances the sensitivity actually decreases toward smaller scales
because of $n$. This is most obvious if we consider a map at a single frequency.
In that case, high-$k$ modes correspond to small angular separations or large
baselines; for a fixed collecting area the array must therefore be more dilute
and the sensitivity per pixel decreases as in equation (141). In the (simple but
unrealistic) case of uniform uv coverage, the error on a measurement of the
angular power spectrum increases like 9^ for a fixed collecting area [158]. 

Fortunately, the three-dimensional nature of the true 21 cm signal moderates
this rapid decline toward smaller scales: even a single dish can measure structure
along the line of sight on small physical scales. Mathematically, because
$n(k, \mu) = n(k_\perp)$ (neglecting the slow variation of uv coverage with frequency
across the band), each baseline can image arbitrarily large $k_\parallel$, at least in principle.
For an interferometer, this implies that short baselines still contribute to
measuring large-$k$ modes. Thus, provided that they have good frequency resolution
(which is anyway required for RFI mitigation), compact arrays like the
MWA are surprisingly effective at measuring small-scale power [398]. There is
one important caveat to this trick: if short wavelength modes are only sampled
along the frequency axis, we can only measure modes with $\mu^2 \approx 1$. Thus we
recover little, if any, information on the $\mu$ dependence of the redshift-space
distortions. Studying this aspect of the signal does require baselines able to
measure the short transverse modes with $\mu^2 \approx 0$.

The preceding discussion shows that understanding the expected errors on 
statistical measurements is crucial to both the array design and the observing 
strategy. To that end, we next review some of the main results of such esti- 
mates [158,278,379,393-399]. We emphasize again that we neglect systematic 
errors in this section. Table 6 shows the basic parameters for several upcoming 
experiments (although many of these are only educated guesses). The major 
element missing is the baseline distribution between $D_{\rm min}$ and $D_{\rm max}$. The figures
below assume that the baselines are distributed as $r^{-2}$ with a compact,
filled core. This is reasonable if $N_a$ is large; if not, the Fourier space coverage
may be patchier.

Fig. 30. Isotropic power spectrum sensitivity, in logarithmic bins with $\Delta k = k/2$, for
several experimental configurations. In each panel, the thin solid and dashed curves
show estimates of the signal with and without reionization. The thick solid, dashed,
and dot-dashed curves show error estimates for 1000 hour observations over 6 MHz
with the SKA, MWA, and LOFAR, respectively. Each assumes perfect foreground
removal. The dotted curve in the middle panel assumes a flat antenna distribution
for the MWA. From [278].

Figure 30 summarizes the expected errors for the MWA (thick dashed curves),
LOFAR (thick dot-dashed curves), and the SKA (thick solid curves) on the
isotropic power spectrum [278]. In each panel, estimates of the signal (with
and without reionization) are shown by the thin dashed and solid curves. The
setup assumes 1000 hours of total integration on a single field (about a one-year
observing campaign), 47 a 6 MHz band (corresponding to $\Delta z \sim 0.5$), bins
of width $\Delta k = k/2$, and $T_{\rm sys}$ = (250, 440, 1000) K at $z$ = (6, 8, 12). We must
also guess how $A_{\rm tot}$ scales with $\nu$ (see details in [278]; for now note that we
optimistically assume that the SKA is optimized for each of these frequencies).
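The correspondence between bandwidth and redshift interval follows from $\nu = \nu_{21}/(1+z)$, which gives $|\Delta z| \approx (1+z)^2\, \Delta\nu/\nu_{21}$ for a narrow band. The sketch below evaluates it for the three redshifts used here; the function names are illustrative.

```python
NU_21_MHZ = 1420.40575  # rest frequency of the 21 cm line

def observed_frequency_mhz(z):
    """Observed frequency of the redshifted 21 cm line."""
    return NU_21_MHZ / (1.0 + z)

def redshift_interval(z, bandwidth_mhz):
    """|dz| = (1 + z)^2 * dnu / nu_21, valid when dnu is much smaller than nu_obs."""
    return (1.0 + z) ** 2 * bandwidth_mhz / NU_21_MHZ

intervals = {z: redshift_interval(z, 6.0) for z in (6, 8, 12)}
```

For a 6 MHz band this gives intervals of roughly 0.2, 0.3, and 0.7 at $z$ = 6, 8, and 12, bracketing the representative value quoted above.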

Several key points are apparent in Figure 30. First and foremost, thermal
noise and cosmic variance do not pose an insurmountable problem. Both the
MWA and LOFAR can make reasonably precise measurements at $z < 10$ and
$k < 1$ Mpc$^{-1}$, and the SKA can extend this to $z > 10$ and $k < 10$ Mpc$^{-1}$
(although foregrounds probably prevent measurements at $k < 0.01$ Mpc$^{-1}$).
The differences with redshift are mostly because of the rapidly increasing sky
background toward lower frequencies (see eq. 136): clearly, our prospects for
signal detection rapidly worsen toward higher redshift. It is for this reason that
exploring the first galaxies and bound structures, as well as the dark ages, will
be much more difficult than reionization, even though the intrinsic signals are
comparable.

47 Unfortunately, $t_{\rm int}$ cannot be increased to arbitrarily large values because
systematics will eventually intervene. Current estimates suggest that integrating for
several hundred hours should be reasonable, but going too much beyond that is
likely to be difficult.

The most surprising aspect of Figure 30 is that the MWA and LOFAR have
comparable performances, despite the nearly order of magnitude difference
in collecting area. There are several reasons for this. First, the MWA has a
much larger field of view as a consequence of its significantly smaller antennae,
allowing it to beat down statistical errors. Second, the smaller antennae of the
MWA allow it to sample modes with $k_\perp < 0.03$ Mpc$^{-1}$, which LOFAR cannot
do (hence the cutoff near this value in the LOFAR curves); on the other
hand, the MWA's extremely compact configuration does not allow it to observe any
modes with $k_\perp > 0.5$ Mpc$^{-1}$. For many radio observations, this poor angular
resolution would be a crippling disadvantage - and that is why LOFAR, meant
as a more general-purpose radio telescope, has both long and short baselines.
But for this particular application, high-$k$ modes can be sampled when they
are oriented along the line of sight.

Finally, the thick dotted curve in the middle panel of Figure 30 shows the
errors if the MWA has a flat baseline distribution (rather than $r^{-2}$). This
configuration reduces the sensitivity on all scales, which may seem surprising if
one is used to measuring angular power spectra: in that case, flat distributions
are more sensitive to high-$k$ modes (e.g., [158]). But, thanks to the frequency
dimension, even compact arrays can measure small-scale structure.

Figure 31 shows another view of the MWA sensitivity [379], this time with
an explicit comparison to some analytic models of reionization [341]. The Figure
assumes a similar experimental setup as before, except with a 360 hour
integration over 8 MHz at $z = 8$. The points show simulated data (again neglecting
systematics) with $1\sigma$ errors; 48 the dark gray regions show the error
envelope from cosmic variance (binned in the same manner as the data), and
the light gray regions include thermal noise as well. The solid curve is the
power spectrum if $\bar{x}_i = 0$; the other curves show predictions for the signal at
various stages of reionization from [341]. The purpose of this figure is to show
explicitly that, over the range 0.01 Mpc$^{-1}$ $< k <$ 0.2 Mpc$^{-1}$, the next generation
of radio telescopes will be able to distinguish models at high confidence
levels, provided that the thermal noise limit can be reached (the low-$k$ cutoff
comes from foreground removal; see below).

To this point, we have only considered the spherically- averaged signal, which 
ignores the cosmological information available from redshift-space distortions. 
Figure 32 shows the sensitivity of the MWA (medium width curves) and the 
SKA (thick curves) to anisotropies in the 21 cm signal [278]. The thin solid and 

48 Note that some of the simulated datapoints have negative power here; this is a 
result of the simplified error propagation used by [379]. In a realistic experiment, the 
estimator will be constructed to preserve the positive-definite nature of the signal. 

Fig. 31. Isotropic power spectrum sensitivity of the MWA at z = 8. The dark gray
regions show errors from cosmic variance; the light gray regions include thermal
noise as well. The data points show a simulated realization of the measured power
spectrum with $1\sigma$ errors. The solid line assumes $\bar{x}_i = 0$, while the dashed lines
assume $\bar{x}_i = 0.51$, 0.43, 0.38, 0.25, and 0.13, from top to bottom [341]. The setup
assumes an 8 MHz band observed for 360 hours. From [379].



dashed curves show $P_{\mu^0}$ and $P_{\mu^4}$, and the others show the errors in bins of width
$\Delta k = k/2$. Here we have assumed 2000 hours of integration time split between
two fields at z = 8. The two panels assume different stages of reionization;
at $\bar{x}_i = 0.1$ the anisotropic component is strong because density fluctuations
dominate, but it is weak at $\bar{x}_i = 0.7$ when the bubble signature (which is
isotropic) dominates the power. The Figure shows how difficult it will be to
extract detailed information on the angular dependence of the 21 cm power
spectrum, especially for the MWA. The MWA sensitivity to the $\mu^4$ component
is sharply peaked because it relies on the frequency axis to provide all of its
high-k information. Thus it cannot constrain the angular dependence of the
power. The larger collecting area and longer baselines of the SKA allow it
to measure transverse modes to $k \sim 1\ {\rm Mpc}^{-1}$. However, even it will have
difficulty seeing the $\mu^4$ component during reionization.

Fig. 32. Power spectrum sensitivity to the fully anisotropic signal. The thin solid
and dashed lines show $P_{\mu^0}$ and $P_{\mu^4}$ for $\bar{x}_i = 0.1$ (right panel) and $\bar{x}_i = 0.7$ (left
panel). The medium width and thick curves show the expected errors in bins of
width $\Delta k = k/2$ for the MWA and SKA, respectively (solid and dashed correspond
to the $\mu^0$ and $\mu^4$ components). We assume 2000 hours of integration split equally
between two fields at z = 8. From [278].



Because the redshift-space distortions (and the AP effect) contain the most 
robust cosmological information, firm measurements of fundamental param- 
eters will be difficult to extract from 21 cm experiments. In the absence of 
the $\mu^n$ terms, these surveys can only measure the matter power spectrum
over a relatively limited range of scales. Thus, only when combined with other 
datasets (such as the CMB) will they be able to effectively constrain cosmolog- 
ical parameters [278,399,401]. The first generation of experiments will likely 
be restricted to studying features from reionization and to offering marginal 
improvements on cosmological parameter determinations from other meth- 
ods (see §4.2). SKA-class instruments can significantly improve constraints 
on parameters that depend on the small-scale power spectrum, such as the 
primordial scalar spectral index $n_s$ and the neutrino mass. But, of course, any
such attempt to extract fundamental parameters relies on understanding the 
astrophysical factors as well. Cosmological studies will be most straightforward
during an epoch in which $\bar{x}_i$ and fluctuations in $T_S$ are both small. Such
an era may or may not exist (see §3); if not, parameter estimation will likely 
be degenerate with these astrophysical processes [401]. 

9.3 Foreground Removal Strategies



As mentioned in §9.1.1, foregrounds are formidable at such low radio fre- 
quencies, where the mean brightness temperature of our Galaxy, the primary 
foreground, exceeds that of the expected 21 cm signal by at least four orders of 
magnitude. Fortunately, while this large mean brightness resets the zero point 
and adds a great deal of noise, it does not introduce systematic difficulties: 
instead, we are interested only in its fluctuations on the sky. These are not yet 
well-studied at the frequencies and angular scales relevant to 21 cm studies. 
Extrapolation from CMB foreground studies (e.g., [402,403]) suggests that 
Galactic fluctuations will be relatively gentle on arcminute scales [272]. An 
observational program at the Westerbork Synthesis Radio Telescope using the 
Low Frequency Front-End (LFFE) receiver system, 49 covering 120-180 MHz, 
is underway to measure the fluctuation spectrum of the Galactic emission. 
However, even if these turn out to be small, the 21 cm signal is still swamped 
by temperature fluctuations from extragalactic sources: radio galaxies [404] 
and free-free emission from the reionizing sources themselves [274, 361, 405] 
both create fluctuations that exceed the signal by one or two orders of mag- 
nitude (for a unified treatment of extragalactic foregrounds, see [406]). 

Fortunately, these difficulties are not insuperable: it should be possible to re- 
cover the 21 cm signal through its structure in frequency space, because all the 
foreground contaminants mentioned above are spectrally smooth [158,294,397, 
407]. In other words, while the 21 cm signal is expected to be isotropic in 3D 
space (neglecting redshift space distortions and evolution), foregrounds have 
strong fluctuations in the transverse direction across the sky but weak ones 
in the radial direction. There are other foregrounds that are not spectrally 
smooth, including radio recombination lines (§9.4.1) and terrestrial radio fre- 
quency interference, both atmospheric and man-made (§9.4.2). The telescope 
response can also introduce frequency structure: sidelobes change with fre- 
quency (§9.1.3), implying that different parts of the sky (with different bright- 
ness temperatures) are surveyed at different frequencies [274]. Perhaps most 
worrying, in large part because so little is known about it, is whether polarized 
features in the Galactic foreground will suffer significant Faraday rotation. The 
polarized component can have brightness temperatures of several Kelvins and 
does have substantial structure on arcminute scales [408-410]. If it is Faraday 
rotated on frequency scales of ~ 1 MHz - not at all unreasonable at these 
frequencies, given the known Galactic magnetic fields - the polarized Galactic 
signal would appear as a chromatic fluctuation in the measured total inten- 
sity if the polarization response of the instrument is not perfectly calibrated 
(see §9.1.4). This is another question being addressed by the LFFE system 
at Westerbork. If these conditions occur, then the onus falls on the precision 



49 See http://www.ursi.org/Proceedings/ProcGA05/pdf/J03-P.14(0817).pdf.
of the calibration algorithms to properly apportion the polarized component
between the Stokes Q and U parameters without falsely coupling to variations 
in total intensity that could be mistaken for the redshifted 21 cm line signal. 
It is not yet clear how stringent these calibration requirements must be and 
whether they can be realized in practice. In general, these frequency-dependent 
foregrounds could be much more difficult to remove and require specialized 
techniques. Here we focus instead on continuum foreground removal. 

As mentioned above, the essential property employed by cleaning techniques 
is the strong frequency coherence of foregrounds. Indeed, if foregrounds were 
perfect power laws, then they could be trivially removed. However, (i) the 
Galactic foreground spectral index fluctuates as a function of both frequency 
(steepening toward lower frequencies) and position, and (ii) extragalactic fore- 
grounds are a sum of power law spectra with different spectral indices; the 
result in general will not be a power law. 50 The stochastic decoherence of fore- 
grounds as a function of frequency sets a fundamental limit on the efficacy of 
foreground removal. These properties are well-understood in the CMB litera- 
ture [412,413] and have recently been applied to the 21 cm problem [158,407]. 

It is simplest to begin by neglecting practical difficulties of the telescopes. We 
imagine an ideal instrument and ask whether foregrounds impose an absolute 
limit on the sensitivity. Although we have emphasized the three-dimensional 
nature of the 21 cm signal, foreground removal is perhaps most intuitively 
understood with the angular power spectrum, because that explicitly sepa- 
rates the angular power spectrum (which is contaminated) from the (much 
more pristine) spectral fluctuations used for the cleaning. To quantify the fre- 
quency decoherence between maps at two different frequencies, consider the 
correlation coefficient [158,407]: 

$$I_l(\nu_1, \nu_2) \equiv \frac{C_l(\nu_1, \nu_2)}{\sqrt{C_l(\nu_1, \nu_1)\, C_l(\nu_2, \nu_2)}} \approx 1 - \frac{\log^2(\nu_1/\nu_2)}{2\xi^2}, \qquad (160)$$

where the second step represents a Taylor expansion about a power law, and
(since all odd terms must vanish) is the lowest order deviation from complete
correlation. The parameter $\xi$ is the "correlation length;" smaller values of $\xi$

50 Note as well that continuum foregrounds have not been probed with such high 
frequency resolution before; one might worry for instance that the Galactic fore- 
ground could exhibit tiny ~ 10~ 4 temperature blips at these small frequency scales 
due to irregularities in the electron momentum distribution in synchrotron knots. 
Fortunately, this is highly unlikely because (i) one averages over a huge number of 
electrons along each line of sight, and (ii) the range of frequencies over which a single 
electron emits either synchrotron or free-free radiation is broad. The convolution 
of this broad emission profile with the electron momentum distribution washes out 
any sharp features in the latter [411]. 
imply more rapid decorrelation over a fixed frequency separation. Note that
$I_l$ can be $l$-dependent (e.g., if sources with different spectral indices $\alpha$ have
different power spectra $C_l$ on the sky), so in general decorrelation will also be
scale-dependent.

Obviously, the crucial parameter is $\xi$. What is its value for realistic foregrounds?
For Poisson fluctuations, $\xi = 1/\Delta\alpha$, where $\Delta\alpha = \langle(\alpha - \bar{\alpha})^2\rangle^{1/2}$ is
the dispersion in spectral indices among sources [412]. If, on the other hand, $C_l$
is dominated by clustering, and in the simple case $d(\ln C_l)/d(\ln \alpha) = 0$, then
$\xi = \sigma_\alpha/\Delta\alpha^2$, assuming that angular correlations between sources with spectral
indices $\alpha_1$ and $\alpha_2$ fall off as $\exp[-(\alpha_1 - \alpha_2)^2/(2\sigma_\alpha^2)]$ [158]. To the extent that
all sources trace the underlying dark matter distribution, we expect them to 
be well-correlated on the sky. Even if they have different linear bias factors, we 
would have perfect correlation ($\sigma_\alpha = \infty$) so long as they all still trace the same
dark matter distribution; only the stochastic part of the bias contributes to
the decorrelation. Thus, we expect $\sigma_\alpha \gg \Delta\alpha$; a firm limit might be $\sigma_\alpha = \Delta\alpha$.
In that case, even though source clustering is expected to be the dominant 
source of angular fluctuations [274,404], the Poisson component likely provides 
an upper bound to frequency decorrelation. For extragalactic radio sources, 
$\Delta\alpha \approx 0.2$ [414], and for the Galactic foreground, $\Delta\alpha \approx 0.15$ [413]. 51

Thus, assuming $\xi \approx 1/\Delta\alpha \approx 5$, $\nu_1 = 150$ MHz, and $\Delta\nu = 1$ MHz, equation
(160) yields $(1 - I) \sim (1\text{-}9) \times 10^{-7}$. It can be shown from a simple $\chi^2$ or Fisher
matrix calculation that fitting for this frequency coherence will suppress foreground
power spectra by a factor $(1 - I)$; thus, as long as $(1 - I)\, C_l^{\rm fg} \ll C_l^{\rm 21cm}$,
foregrounds can be effectively removed. Even for the most pessimistic estimates,
the raw foreground spectrum has $C_l^{\rm fg}/C_l^{\rm 21cm} < 10^5$ [406], so suppression
by $(1 - I)$ suggests that residual foreground contamination should be small.
In fact, cleaning can be even more effective, because this estimate used only
one pair of maps, instead of the $N(N - 1)/2$ pairs generated with $N$ frequency
channels. However, their errors are correlated, and difference maps from larger
frequency separations have greater frequency decorrelation, so the net errors
on $N$ maps will improve more slowly than $\sim [(N - 1)/2]^{1/2}$. More precise error
forecasts can be achieved via a Fisher matrix calculation [407]. 
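
The estimate above can be checked numerically. The following is a minimal sketch, assuming that the logarithm in equation (160) is the natural logarithm and that the sources are unclustered power laws with Gaussian-distributed spectral indices. The function names, the 150 MHz pivot, the mean spectral index of 0.8, and the number of sources are illustrative choices, not values from the text (the coherence does not depend on the mean index).

```python
import numpy as np

def coherence_taylor(nu1, nu2, xi):
    """Lowest-order coherence: I = 1 - ln^2(nu1/nu2) / (2 xi^2)."""
    return 1.0 - np.log(nu1 / nu2) ** 2 / (2.0 * xi ** 2)

def coherence_poisson_mc(nu1, nu2, delta_alpha, alpha_mean=0.8,
                         n_src=200_000, seed=1):
    """Monte Carlo coherence for unclustered sources with S ~ nu^-alpha.

    For shot noise, C(nu1, nu2) is proportional to sum_i S_i(nu1) S_i(nu2).
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(alpha_mean, delta_alpha, n_src)
    s1 = (nu1 / 150.0) ** (-alpha)  # unit flux at an assumed 150 MHz pivot
    s2 = (nu2 / 150.0) ** (-alpha)
    return np.mean(s1 * s2) / np.sqrt(np.mean(s1 ** 2) * np.mean(s2 ** 2))

delta_alpha = 0.2
xi = 1.0 / delta_alpha

# Wide separation, where the decorrelation is large enough to resolve by Monte Carlo.
mc_wide = 1.0 - coherence_poisson_mc(150.0, 300.0, delta_alpha)
taylor_wide = 1.0 - coherence_taylor(150.0, 300.0, xi)

# Adjacent 1 MHz channels at 150 MHz, as in the text.
one_minus_i = 1.0 - coherence_taylor(150.0, 151.0, xi)
residual_ratio = 1e5 * one_minus_i  # pessimistic C_fg / C_21cm of 1e5

print(f"1 - I (150 vs 300 MHz): MC {mc_wide:.4f}, Taylor {taylor_wide:.4f}")
print(f"1 - I (150 vs 151 MHz): {one_minus_i:.2e}")
print(f"residual foreground / signal: {residual_ratio:.3f}")
```

Under these assumptions the Monte Carlo and the Taylor form agree at the percent level for the wide separation, and the 1 MHz case leaves a residual well below the signal even for a raw foreground-to-signal ratio of $10^5$.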

Over large bandwidths, the increasing decorrelation with larger frequency 
separation implies that the Taylor expansion in equation (160) is no longer 
valid, so the (unknown) exact form of the coherence function, $I(\nu_1, \nu_2) = f(\log[\nu_1/\nu_2]/\xi)$,
becomes important. The results can then be counterintuitive
and careful study is required [407]. One could attempt to fit a parametric function

51 Note that the dominant contribution to both of these spectral index fluctuations
is synchrotron emission; the flat spectrum of free-free emission - which varies only
slightly due to the temperature dependence of the Gaunt factor and the exponential
cutoff - provides $\Delta\alpha_{\rm ff} \sim 0.03$ [407].

to the coherence function of the data, which could improve the accuracy of
the analysis. Two general points are worth noting. Accuracy diminishes toward 
the ends of the frequency interval, since there are fewer neighboring channels 
to correlate against. Also, these estimates assume that the 21 cm signal has 
no frequency coherence, with each channel probing a statistically independent 
slice of the universe. Obviously this assumption fails with narrower channel 
widths. This (together with the increasing receiver noise) imposes a limit on 
how narrow a bandwidth one should use for foreground cleaning, even though 
in principle $I \to 1$ as $\Delta\nu \to 0$. The relative distribution of large- and small-scale
power in foregrounds and the 21 cm signal also affects the appropriate
set of basis functions to use for foreground cleaning, as described below. 

Thus the astrophysics itself does not present an insurmountable problem. 
Much more worrisome is how this blanket of sources couples with practical 
telescope design issues. The most important aspect is that the diffraction limit 
of the telescope (and hence the uv vector corresponding to each baseline, the 
field of view, and the sidelobes) scales with wavelength. Thus the beamsize in- 
creases steadily toward smaller frequencies, and new sources are added to each 
frequency channel. This provides another - and probably more important - 
source of de-correlation than the spectral index fluctuations [274]. For example, 
in the simplest approximation we would have $I \sim 1 - (\Delta\nu/\nu)^2$ from adding new
sources (with a constant density on the sky) to the beam. With $\nu_1 = 150$ MHz
and $\Delta\nu = 1$ MHz, we would therefore have $1 - I \sim 4 \times 10^{-5}$, which could
introduce fluctuations comparable to the cosmological signal. Clearly, careful 
attention must be paid to the beam pattern. Moreover, the changing uv cover- 
age implies that simply binning the data in a frequency-independent manner 
will introduce artifacts into the measurements that must be modeled properly, 
especially when the uv coverage is not dense. Monte Carlo runs with simulated 
data will allow us to understand the amplitude of such effects, and develop 
optimal binning strategies. 
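
A quick numerical check of the beam-induced decorrelation quoted above, as a sketch: it reads the simplest approximation as $1 - I \sim (\Delta\nu/\nu)^2$, an interpretation adopted here because it reproduces the quoted value of about $4 \times 10^{-5}$, and it reuses the pessimistic foreground-to-signal ratio of $10^5$ from [406]. The function name is hypothetical.

```python
def beam_decorrelation(nu_mhz, dnu_mhz):
    """1 - I from new sources entering the beam, taken as (dnu / nu)^2."""
    return (dnu_mhz / nu_mhz) ** 2

one_minus_i_beam = beam_decorrelation(150.0, 1.0)
# Residual foreground power relative to the 21 cm signal for C_fg / C_21cm = 1e5.
residual_ratio_beam = 1e5 * one_minus_i_beam

print(f"1 - I = {one_minus_i_beam:.1e}")
print(f"residual foreground / signal = {residual_ratio_beam:.1f}")
```

With these numbers the residual is of the same order as the signal itself, which is why the chromatic beam is a more serious concern than the intrinsic spectral index fluctuations.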

How will foreground removal be done in practice? The first step is to excise 
all bright point sources which exceed some detection threshold. Ideally, this 
should be done directly from the uv visibilities to account for the instrumental 
response and sidelobes, though reality may require more complicated algo- 
rithms (see §9.1.5). The remaining unresolved sources (and fitting errors from 
the subtraction of bright sources) are then the target of foreground cleaning 
procedures. 

The next step excises the spectrally smooth component of the observed sky. Of 
course this will be done on the three-dimensional datacube, so the algorithms 
differ in detail from the ones we have described above. In fact, there are many 
possible variants with no one optimal procedure for foreground cleaning; the 
best solution will likely depend on what statistic one is trying to extract 
from the data. The most widely discussed approach - which we focus on - 
is to fit a smooth function to the spectral data and subtract it, generally on a
pixel-by-pixel basis (e.g., [278,294,411,415,416]). This process has a rigorous
statistical justification as a means of extracting a tiny fluctuating signal from
a huge, slowly varying background [417] (in statistical parlance, it is often
called trend removal). Similar problems crop up in time series analysis, and
trend removal has also been used in analysis of quasar absorption spectra to
estimate the underlying continuum before extracting the Lyα forest [418]. We
summarize the present understanding for 21 cm foreground cleaning below. 

Consider the foreground amplitude $f_i = f(\mathbf{k}_\perp, \nu_i)$ in a pixel with angular
index $\mathbf{k}_\perp$ and frequency $\nu_i$. We let the vector $\mathbf{f}$ denote $f$ measured at the
frequencies $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_N)$ with resolution $\Delta\nu$. The fundamental idea is to fit
$\mathbf{f}$ with a set of smooth basis functions. This corresponds to projecting out or
marginalizing over the corresponding modes in the data [278]; thus to remove
the slowly varying foregrounds, we only fit low-order, slowly-varying basis 
functions. Such techniques are commonly used in CMB analyses to marginalize 
over the temperature monopole and dipole, which suffer strong contamination. 
If the set of basis functions is complete and orthonormal (such as the Legendre 
polynomials), then the error analysis can be performed analytically. 52 The 
residual foreground contamination after cleaning with n basis functions is: 

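$$\tilde{\mathbf{f}} \equiv \left(1 - \sum_{l=0}^{n} \mathbf{P}_l \mathbf{P}_l^\dagger\right) \mathbf{f} = \sum_{l=n+1}^{\infty} \mathbf{P}_l \mathbf{P}_l^\dagger\, \mathbf{f}. \qquad (161)$$

The power spectrum of residual foregrounds is then given by [278]:

$$Q_{\mathbf{k}_\perp}(k, n) \approx \frac{1}{w}\, |\boldsymbol{\mu}_k^\dagger \tilde{\mathbf{f}}|^2, \qquad (162)$$

where the Fourier vector $\boldsymbol{\mu}_k \propto \exp[i y k \nu]$. Here $w \sim \lambda^2 B^2/(A_e D^2 \Delta D)$; see
§9.2 for definitions of its components. $Q_{\mathbf{k}_\perp}$ is simply the square of the Fourier
coefficients of $\tilde{\mathbf{f}}$, appropriately normalized. Foreground cleaning is effective
provided that $Q_{\mathbf{k}_\perp}(k, n) \ll P_{\Delta T}$.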
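
The projection in equation (161) can be illustrated with a toy spectrum. This is a minimal sketch under stated assumptions: the foreground is a 300 K power law with a small running index, the 21 cm signal is replaced by 10 mK of uncorrelated noise, and the Legendre modes are orthonormalized on the discrete channel grid with a QR decomposition. The band, the amplitudes, and the name `legendre_projector` are illustrative and are not taken from [278].

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_projector(n_chan, order):
    """Return 1 - sum_l P_l P_l^T for Legendre modes l = 0..order on n_chan channels."""
    x = np.linspace(-1.0, 1.0, n_chan)
    vander = legendre.legvander(x, order)   # columns: P_0 .. P_order sampled on the grid
    q, _ = np.linalg.qr(vander)             # orthonormalize the modes on the discrete grid
    return np.eye(n_chan) - q @ q.T

rng = np.random.default_rng(0)
nu = np.linspace(147.0, 153.0, 64)          # MHz, an illustrative 6 MHz band

# Toy foreground: 300 K power law at 150 MHz with a small running spectral index.
lnr = np.log(nu / 150.0)
foreground = 300.0 * np.exp(-2.6 * lnr - 0.1 * lnr ** 2)
# Stand-in for the 21 cm signal: 10 mK of channel-to-channel uncorrelated noise.
signal = 0.01 * rng.standard_normal(nu.size)

clean = legendre_projector(nu.size, order=3)
resid_fg = clean @ foreground               # residual foreground after cleaning
resid_sig = clean @ signal                  # attenuated signal after cleaning

print(f"foreground rms before/after: {np.std(foreground):.2f} K / {np.std(resid_fg):.1e} K")
print(f"signal rms before/after:     {np.std(signal):.4f} K / {np.std(resid_sig):.4f} K")
```

In this toy case the smooth foreground is suppressed by many orders of magnitude while the uncorrelated signal loses only the few modes that were projected out.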

Of course, during this cleaning process, the signal $\mathbf{s}$ will be attenuated as
well: $\mathbf{s} \to \tilde{\mathbf{s}} = \sum_{l=n+1}^{\infty} \mathbf{P}_l \mathbf{P}_l^\dagger\, \mathbf{s}$. The fundamental assumption of this procedure
is that the scales on which the signal and foregrounds have significant fluc- 
tuation power are sufficiently well-separated that the signal is unaffected by 
foreground removal (or, at least, one is willing to sacrifice the smoothly vary- 
ing modes in the signal). One must then determine the optimal cleaning basis 

52 Note that the most intuitive set of basis functions, polynomials in $\log(\nu)$ - corresponding
to a Taylor series around a power law foreground - is not orthogonal, so
the error analysis is somewhat more complicated.




Fig. 33. Foreground removal for a 1000 hour observation with the MWA over a 
bandwidth B = 6 MHz at z = 8, using a quadratic (left panel) or cubic (right 
panel) Legendre polynomial. The dark thin solid curve is the 21 cm signal for a 
neutral universe. The thick top curves show the sensitivity for a single Fourier pixel, 
while the medium width bottom curves are the foreground residual power spectra 
(eq. 162). In each set, the solid, dashed, and dot-dashed curves assume that 6, 12, 
and 24 MHz spectral chunks are used to fit the foregrounds. Note how foreground 
cleaning reduces sensitivity on large scales, especially for a high-order polynomial 
fit to a small spectral length. On the other hand, the same procedure also more 
effectively removes foregrounds. From [278]. 

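$\mathbf{P}_l$, which will maximize $\langle\tilde{\mathbf{s}}\tilde{\mathbf{s}}^\dagger\rangle$ while minimizing $\langle\tilde{\mathbf{f}}\tilde{\mathbf{f}}^\dagger\rangle$. For instance, the order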
n of a fitting polynomial plays an analogous role to the smoothing parameter 
in any regularization problem, yielding a trade-off between smoothness and 
fidelity to the data. If n is too low, there are insufficient degrees of freedom to 
remove the foregrounds efficiently; if n is too high, some of the cosmological 
signal is removed. The choice of expansion basis (Chebyshev polynomial, Leg- 
endre polynomial, broken power law, smoothing spline, etc) likewise affects 
the relative amounts of foregrounds and cosmological signal removed [411]. 

Fortunately, the effects of a particular basis choice on a particular power spec- 
trum can be calculated a priori (thus quantifying the amount of downward bias 
in the recovered large scale 21 cm power spectrum). Figure 33 (from [278]) 
shows an example. As mentioned above, foreground cleaning reduces the sen- 
sitivity to the signal at large scales, and increasing n removes more of both 
the cosmological signal and the foregrounds. The scales over which foreground 
cleaning is performed also play an important role: over a smaller bandwidth,
more of the small-scale cosmological signal is removed (conversely, over a larger
bandwidth, a higher order polynomial is required to fit the foregrounds effectively).
A simple rule of thumb is that, for a given bandwidth, one should
choose the minimum n such that $Q_{\mathbf{k}_\perp}(k, n) \ll P_{\Delta T}(\mathbf{k}_\perp, k)$ [278].
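
The rule of thumb can be sketched as a simple search over polynomial order. The example below is a toy under stated assumptions: it compares rms amplitudes along a single line of sight rather than the power spectra Q and P, uses the same illustrative running power-law foreground in every band, and treats a residual below 10% of an assumed 10 mK signal rms as "much smaller". The helper names and the `margin` parameter are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def residual_after_fit(x, spectrum, order):
    """Subtract the least-squares Legendre fit of the given order."""
    coeffs = legendre.legfit(x, spectrum, order)
    return spectrum - legendre.legval(x, coeffs)

def minimum_order(nu, foreground, signal_rms, margin=0.1, max_order=8):
    """Smallest order whose residual foreground rms is below margin * signal_rms."""
    x = 2.0 * (nu - nu.min()) / (nu.max() - nu.min()) - 1.0  # map the band to [-1, 1]
    for order in range(max_order + 1):
        if np.std(residual_after_fit(x, foreground, order)) < margin * signal_rms:
            return order
    return None  # no order up to max_order was sufficient

def toy_foreground(nu):
    """300 K power law at 150 MHz with a small running spectral index (assumed)."""
    lnr = np.log(nu / 150.0)
    return 300.0 * np.exp(-2.6 * lnr - 0.1 * lnr ** 2)

narrow = np.linspace(147.0, 153.0, 64)    # 6 MHz spectral chunk
wide = np.linspace(138.0, 162.0, 256)     # 24 MHz spectral chunk
n_narrow = minimum_order(narrow, toy_foreground(narrow), signal_rms=0.01)
n_wide = minimum_order(wide, toy_foreground(wide), signal_rms=0.01)
print(f"minimum order: {n_narrow} (6 MHz), {n_wide} (24 MHz)")
```

The wider chunk never needs a lower order than the narrow one in this toy, in line with the statement that larger bandwidths require higher order polynomials.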

Of course, each stage of foreground removal (including point source removal 
and spectral fitting) is imperfect. Thus the final stage, performed simultane- 
ously with the signal extraction, is to fit for these residuals statistically using 
prior knowledge of their systematic shapes [416]. The details of this phase will 
depend sensitively on the instrument and algorithms chosen, so its effective- 
ness has not yet been explored quantitatively. 

The requirement in these foreground removal algorithms that the signal and 
foregrounds fluctuate on highly disparate scales appears to impose a funda- 
mental limit on our ability to probe the large-scale 21 cm signal. However, 
at least in the high S/N regime where individual HII bubbles can be im- 
aged (§8.4), this limitation can be broken: since HII regions should contain 
pure foreground emission, they provide firm calibration points that can be 
renormalized to $\delta T_b = 0$ K in the foreground-cleaned maps. Calculations using
Monte Carlo simulations of an analytic bubble size distribution (from [341])
show that, without such a correction, any measurement of the pixel PDF or
of $\delta T_b(z)$, as well as the recovered map itself, will have severe artifacts from
large scale unphysical features introduced by foreground removal [411]. How- 
ever, once the recalibration is done, fidelity to the input maps - as well as 
detailed statistics, including the PDF and large-scale power spectrum - is re- 
markable, except possibly near the box edges. Figure 34 shows an example of 
the reconstruction. The main limiting factor becomes the number density of 
sufficiently large bubbles that can be unambiguously identified as such. Their 
number density must be of order $n_b^{1/3} \sim k$ to perform reliable foreground
recalibration on wavenumber k. 
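
One way to picture the recalibration step is along a single line of sight. The sketch below is a toy reading of the idea, not the procedure of [411]: it assumes that the channels lying inside ionized bubbles are already known, that the true signal there is exactly zero, and that a fifth-order Legendre polynomial is used both for cleaning and for the correction. All names, amplitudes, and bubble positions are illustrative.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_fit(x, y, order, at):
    """Least-squares Legendre fit to (x, y), evaluated at the points `at`."""
    return legendre.legval(at, legendre.legfit(x, y, order))

x = np.linspace(-1.0, 1.0, 128)                    # scaled frequency axis
nu = 150.0 + 12.0 * x                              # MHz, illustrative 24 MHz band
lnr = np.log(nu / 150.0)
foreground = 300.0 * np.exp(-2.6 * lnr - 0.1 * lnr ** 2)

# Toy signal: 20 mK in neutral gas, exactly zero inside three ionized bubbles.
in_bubble = np.zeros(x.size, dtype=bool)
for start, stop in [(10, 22), (55, 70), (100, 118)]:
    in_bubble[start:stop] = True
signal = np.where(in_bubble, 0.0, 0.02)

data = foreground + signal
order = 5
cleaned = data - legendre_fit(x, data, order, x)   # standard trend removal

# Recalibration: the cleaned spectrum should vanish inside bubbles, so fit the
# smooth offset seen there and subtract it from every channel.
offset = legendre_fit(x[in_bubble], cleaned[in_bubble], order, x)
recalibrated = cleaned - offset

print(f"max error before recalibration: {np.max(np.abs(cleaned - signal)):.4f} K")
print(f"max error after recalibration:  {np.max(np.abs(recalibrated - signal)):.4f} K")
```

Trend removal alone strips the mean and other smooth modes from the signal; anchoring the zero point on the bubble channels restores them, provided the bubbles sample the line of sight densely enough to constrain the smooth offset.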

As with the optimal set of basis functions, many of the details of foreground 
cleaning - such as the appropriate space to work in (real, Fourier, or some 
admixture) - depend on the details of one's application (e.g., foreground re- 
calibration is only possible in real space), as well as the particulars of array 
design. They are likely best addressed in end-to-end simulations where full 
error probability distributions (and not just the Fisher matrix approximation, 
because systematic errors introduced by cleaning may not be gaussian) and 
biases (such as the reduction of large scale power) can be carefully quantified. 



9.4 Other Systematics 

The Galactic and extragalactic backgrounds (including the polarized compo- 
nent; see §9.1.4 and 9.3) are only two of many systematic concerns for these 
experiments. Below we will briefly describe three additional concerns: foreground
lines, terrestrial interference, and ionospheric distortion.

Fig. 34. A simulation of foreground removal for the SKA, when telescope noise is low
and 21 cm tomography might be possible, showing (a) the input box and (b) the
recovered box. The main features of the box, especially
the largest ionized bubbles (shown in black), are robustly recovered. This simulation
utilizes a fifth-order Chebyshev polynomial basis for foreground removal, as well as
the additional step described in the text of zeroing the large bubbles to calibrate
the foregrounds. The box is ~ 400 comoving Mpc across at z ~ 9 and contains $128^3$
pixels. From [411].

9.4.1 Radio Recombination Lines

Unlike free-free and synchrotron emission, foreground radio recombination 
lines (RRLs) can introduce significant structure in frequency space. These 
lines, which are generated by recombination cascades through high-n elec- 
tronic levels in HII regions, could therefore be serious contaminants. Little is 
known about the RRL background at the low radio frequencies accessed by 
redshifted 21 cm instruments; indeed, these experiments could turn out to be a 
major source of new information about both galactic and extragalactic RRLs. 
Here we briefly summarize our existing knowledge of the RRL background (for 
a comprehensive review, see [419]; see also the discussions in [272,274,420]). 

The most likely source of contamination is our Galaxy. The frequency of a
hydrogen RRL between levels $n$ and $n - \Delta n$ is

$$\nu = c R_H \left[\frac{1}{(n - \Delta n)^2} - \frac{1}{n^2}\right], \qquad (163)$$

where $R_H$ is the Rydberg constant for hydrogen.
Observationally, the lines tend to occur every 1-2 MHz over the frequency
range of interest. Galactic ridge RRLs make the transition from emission to
absorption in the range 100-200 MHz and so are at a minimum at the relevant 
frequencies. They can reach peak brightness temperatures of $T_L \approx 1$ K, but
their narrow line widths ($\Delta\nu \sim 3$ kHz at 100 MHz) imply that their spectral
occupancy is quite small. The first line of defense is therefore to simply excise 
contaminated regions of the spectrum (which can be identified a priori); all 
of the experiments must already work at high spectral resolution to excise 
terrestrial interference (see below). 
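
The line spacing and spectral occupancy can be estimated directly from the standard Rydberg formula. The sketch below is illustrative only: it considers hydrogen α lines ($\Delta n = 1$) alone, takes the 3 kHz width quoted at 100 MHz as constant across the band, and uses a 100-200 MHz band chosen for this example. The constant and function names are hypothetical.

```python
import numpy as np

RYDBERG_HZ = 3.2898419603e15                     # R_infinity * c in Hz
R_H_HZ = RYDBERG_HZ / (1.0 + 1.0 / 1836.15267)   # reduced-mass correction for hydrogen

def rrl_frequency_mhz(n_upper, dn=1):
    """Frequency of the hydrogen transition n_upper -> n_upper - dn, in MHz."""
    n_lower = n_upper - dn
    return R_H_HZ * (1.0 / n_lower ** 2 - 1.0 / n_upper ** 2) / 1e6

n = np.arange(250, 500)                          # upper levels bracketing the band
freqs = rrl_frequency_mhz(n)
in_band = freqs[(freqs >= 100.0) & (freqs <= 200.0)]
spacing = np.abs(np.diff(in_band))

line_width_mhz = 3e-3                            # ~3 kHz, assumed constant across the band
occupancy = in_band.size * line_width_mhz / 100.0

print(f"{in_band.size} alpha lines between 100 and 200 MHz")
print(f"spacing: {spacing.min():.2f}-{spacing.max():.2f} MHz")
print(f"fraction of the band occupied: {occupancy:.2%}")
```

Under these assumptions the α lines fall roughly every 1-2 MHz and cover well under one percent of the band, which is why excising the affected channels costs little bandwidth.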

The degree to which this will be necessary depends on whether RRLs will 
provide a significant source of contamination along lines of sight outside the 
Galactic plane (see Fig. 25). Unfortunately, there have been only a few RRL 
line searches outside the Galactic plane (except toward known bright sources), 
and in any case RRL surveys generally have low-frequency detection limits well 
above the strength of the 21 cm features we seek. We can, however, place a 
useful limit using the fact that the observed brightness of both RRL and Hα
emission depends on the emission measure ${\rm EM} = \int ds\, n_e^2$, where $n_e$ is the
local electron density and $s$ is the path length along the line of sight, and that
optical Hα surveys have much lower limiting sensitivities. Fabry-Perot surveys
have detected Hα emission from every Galactic latitude, with minimal values
0.25-0.8 Rayleighs 53 toward the Galactic poles [421]. The emission measure
corresponding to an observed Hα intensity $I_\alpha$ (in Rayleighs) is:
${\rm EM(H\alpha)} = 2.75\, T_4^{0.9}\, I_\alpha$ cm$^{-6}$ pc, where $T_4 = T_e/(10^4\ {\rm K})$ and $T_e$ is the temperature of the
emitting region. Assuming local thermodynamic equilibrium (LTE) and that
$T_L \approx \tau_{\rm RRL} T_e$ (see §2.1), we find [274]:

Thus, unless stimulated emission becomes important at low frequencies, the 
Galactic hydrogen RRL background should be negligible away from the Galac- 
tic plane. 
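
As a small worked example of the first step in this argument, the sketch below converts the quoted high-latitude Hα intensities into emission measures using the relation given above. It assumes $T_4 = 1$ (a $10^4$ K emitting region), which is an illustrative choice rather than a value given in the text; the function name is hypothetical.

```python
def emission_measure(i_alpha_rayleigh, t4=1.0):
    """EM in cm^-6 pc from an H-alpha intensity in Rayleighs: 2.75 T4^0.9 I_alpha."""
    return 2.75 * t4 ** 0.9 * i_alpha_rayleigh

em_low = emission_measure(0.25)
em_high = emission_measure(0.8)
print(f"EM toward the Galactic poles: {em_low:.2f}-{em_high:.2f} cm^-6 pc")
```

Emission measures of order unity in these units are what feed the LTE estimate of the line brightness away from the plane.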

Carbon radio recombination lines are another possible source of contamina- 
tion. They have been detected at $\nu = 34$-$325$ MHz toward Cas A in the
Galactic plane [422] and also in a 327 MHz survey centered at Galactic longitude
$l = 14°$ [423]. Optical depths are generally $\tau \sim 10^{-3}$, and the lines are
thought to arise in dense, cold ($T_e \sim 20$-$200$ K), partially ionized regions where
non-LTE effects are important. Carbon RRL emission is generally confined to 
latitudes b < 3°, and its intensity out of the plane is probably small [423]. 

Because Galactic lines can probably be excised anyway, a much more wor- 
risome contaminant could be emission from unresolved extragalactic sources: 

53 Note 1 Rayleigh = $10^6/4\pi$ photons cm$^{-2}$ s$^{-1}$ sr$^{-1}$.
since these would be randomly distributed in redshift and hence frequency, the
line structure would be much harder to remove. There are no observations of 
extragalactic RRLs at these low frequencies and hence any estimate has con- 
siderable uncertainty, but at present such contamination is not expected to 
pose a significant threat [420]. Neglecting stimulated emission, a star forming
galaxy producing a brightness temperature perturbation $T_L \sim 0.01$ K would
have to lie within a distance

where $\theta$ is the angular resolution of the beam. Such bright nearby sources
can easily be identified and excised. The integrated RRL background from 
unresolved sources is similarly small [420]. However, one large uncertainty is 
the role that non-LTE effects and stimulated emission could play in boosting 
the RRL flux; these may be important at low frequencies because $\tau_{\rm RRL} \propto \nu^{-1}$.
It was previously thought that emission stimulated by the continua of
radio galaxies and quasars could allow much more distant sources to be seen 
[424]. But existing observations at higher frequencies (e.g., at 1.4, 8.1, 84, 96, 
and 207 GHz [425]) indicate that emission stimulated by luminous sources 
outside HII regions is unimportant, probably because the volume filling factor 
of HII regions around radio quasars is small [426] (stimulated emission from 
internal sources may be important, but it does not significantly boost the flux). 
Furthermore, the observed line flux falls off unexpectedly rapidly towards lower 
frequencies. Deviations from these expectations would be interesting in their 
own right, useful in probing the ISM of galaxies. 

Extragalactic RRLs could also be a foreground in the search for 21 cm absorp- 
tion against high-redshift radio-loud sources (see §10) [274]. Since high-redshift 
objects are potentially much denser, $n_e^2 \propto (1 + z)^6$, their emission measures
could be much higher. Thus, for instance, a high-redshift disk galaxy could
have $\tau_{\rm RRL} \sim 10^{-2}$, comparable to that of the IGM or a minihalo in 21 cm
absorption. If such disks are abundant, this could make the task of picking 
out the 21 cm forest lines much tougher, though the RRL features themselves 
would be a marvelous probe of gas clumping. However, the existence of such 
RRL absorbers is extremely speculative, and only a small fraction of lines of 
sight are expected to intersect disks [427]. 

9.4.2 Terrestrial Radio Frequency Interference

Modern society's needs for high speed communication and precise navigation 
rely on intensive use of the radio spectrum. Strong signals from these services 
are present in any location in which people live in significant numbers and 
form one of the strongest "foregrounds" affecting radio astronomy. In addition 
to the strong signals broadcast within specifically allocated frequency bands
(through national and international coordination of the radio spectrum), there 
is a forest of spurious emission permeating areas of high population density, 
because most electrical equipment emits low levels of radio emission that falls 
below legal thresholds set by governmental agencies but whose cumulative 
effect is substantial given the high sensitivity of radio telescopes. 

Radio astronomy's first line of defense against this radio frequency interference 
(RFI) is to situate telescopes in remote locations, far from dense human habita- 
tion. Unfortunately, as the world's population and the demand for broadband 
communication services increase, it is becoming difficult to escape RFI. For 
example, satellites used to obtain global access to instantaneous communica- 
tions cause some radio bands to be occupied at every place on Earth at all 
times. 

Through the national and international agencies that coordinate spectrum 
use, radio astronomers have successfully negotiated legislation that provides 
for a number of narrow "protected" bands centered on selected bands of astro- 
physical interest, such as the rest frequency of the 21 cm line. Of course, it is 
impossible to reserve the sort of large bandwidths necessary for fundamentally 
broadband scientific applications such as ours. 

Preparation for new observatories, such as the SKA, includes intensive politi- 
cal discussion to define "radio quiet zones" in areas of low population density 
that are unlikely to see future growth, in order to provide some local protection 
for the considerable investment needed to build and operate the telescopes. 
These zones will enforce limits on nearby transmissions, but, due to the high 
sensitivity of radio telescopes, they are still vulnerable to distant high-power 
broadcasts and satellites. In fact, distant TV broadcasts and other ambient 
RFI problems have already ruined an early attempt to use the VLA to ob- 
serve the putative HII regions around the SDSS quasars (L. Greenhill, private 
communication).

Figure 35 compares the radio environment at three sites in Australia with a 
range of population densities (Sydney with population ~4 million, Narrabri 
with ~4,000 and Mileura Station with ~4). The plot spans 100-300 MHz,
corresponding to z = 13.2-3.7 for the 21 cm line. The tail end of reionization
(at z ~ 6.2) falls at 197 MHz. This is within the high-VHF television band, 
which spans 174-220 MHz throughout most of the world. The band at 87.5 to 
108 MHz finds heavy use around the globe for FM radio. 
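The correspondence between observed frequency and redshift used in this paragraph follows from dividing the rest frequency of the line by (1 + z). A minimal sketch in Python, assuming a rest frequency of 1420.40575 MHz (the function names are illustrative and do not come from the text):

```python
# Sketch: convert between observed frequency and redshift for the 21 cm line.
# Assumption: rest frequency of 1420.40575 MHz; function names are illustrative.
NU_21CM_MHZ = 1420.40575


def redshift_from_frequency(nu_obs_mhz):
    """Redshift at which the 21 cm line is observed at nu_obs_mhz."""
    return NU_21CM_MHZ / nu_obs_mhz - 1.0


def frequency_from_redshift(z):
    """Observed frequency (MHz) of the 21 cm line emitted at redshift z."""
    return NU_21CM_MHZ / (1.0 + z)


if __name__ == "__main__":
    # Edges of the plotted band and the frequency quoted for z ~ 6.2
    for nu in (100.0, 197.0, 300.0):
        print(f"{nu:5.0f} MHz -> z = {redshift_from_frequency(nu):.1f}")
    # Edges of the FM radio band, for comparison
    for nu in (87.5, 108.0):
        print(f"{nu:5.1f} MHz -> z = {redshift_from_frequency(nu):.1f}")
```

With this conversion the FM band maps onto roughly z = 12-15, so it overlaps redshifts of interest for the earliest phases of reionization.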

The frequency range from 120 to 140 MHz includes aircraft communications
and a low earth orbit satellite band at 137 MHz; these are hard to avoid since
air traffic and satellite transmitters are likely to pass within the reception
patterns of nearly all Earth-based telescopes from time to time. Fortunately,
these signals have extremely narrow bandwidths, and, in the case of aircraft
communications, have only a tiny temporal duty factor. This means that they
can be efficiently edited (or "blanked") from the data stream.

Fig. 35. Radio environment at three Australian locations: a suburb of Sydney, NSW;
the Australia Telescope Compact Array site near Narrabri, NSW; and the Mileura,
WA site (proposed by Australia for the SKA). The FM radio band (87.5 to 108 MHz
throughout most of the world) enters the spectrum at left. The strong features in
the center of the plot (174 to 230 MHz) are Australian high-VHF television channels
6, 7, 8, 9, 9A, 10, 11 and 12. Shown courtesy of A. Chippendale and R. Beresford
(taken as part of the ATNF SKA Site Monitoring Program).

Not surprisingly, Figure 35 shows that the radio spectrum in densely populated 
places is saturated, and nowhere in the Sydney spectrum does the ambient 
power level drop down to the baseline set by the instrumental sensitivity. 
Fortunately, remote locations such as Mileura are able to escape (for the most 
part) the radio and TV contamination of modern civilization. However, it 
must be noted that the sensitivity of the measurements shown in the figure 
was limited by the equipment used in this monitoring exercise; here -200 dB
(W m$^{-2}$ Hz$^{-1}$) corresponds to radio astronomical flux densities of $10^6$ Jy,
while observations of reionization will need to be sensitive to ~ 10 μJy signals.
Fortunately, more sensitive measurements [428] at the Mileura site show that 
few signals appear in the blank areas of the spectrum in Figure 35, even after 
long integrations sensitive to a small fraction of the sky brightness. 
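The gap between the monitoring sensitivity and the reionization signal is easier to appreciate when both are placed on the same scale. The following sketch converts between spectral power flux density in dB relative to 1 W m^-2 Hz^-1 and flux density in Jy, using only the definition 1 Jy = 10^-26 W m^-2 Hz^-1 (function names are illustrative):

```python
import math

JY_IN_SI = 1.0e-26  # 1 Jy expressed in W m^-2 Hz^-1


def db_to_jansky(level_db):
    """Convert a level in dB(W m^-2 Hz^-1) to a flux density in Jy."""
    return 10.0 ** (level_db / 10.0) / JY_IN_SI


def jansky_to_db(flux_jy):
    """Convert a flux density in Jy to a level in dB(W m^-2 Hz^-1)."""
    return 10.0 * math.log10(flux_jy * JY_IN_SI)


if __name__ == "__main__":
    print("monitoring floor [Jy]:", db_to_jansky(-200.0))
    print("10 microJy target [dB]:", jansky_to_db(10.0e-6))
```

On this scale a 10 μJy signal sits near -310 dB(W m^-2 Hz^-1), about 110 dB below the -200 dB reference level of the monitoring data.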



Technological advances in high speed digital electronics are now permitting 
the development of a range of RFI "mitigation" techniques whose goal is to 
permit observations in bands contaminated with human-generated signals. 
This is necessary in studies of spectral lines, such as the redshifted 21 cm line, 
that require measurements in frequency bands already in use. The algorithms 
and hardware now being developed are the subject of several recent reviews 
[429-431]. 

RFI mitigation methods include a number of different approaches. Some focus 
on efficient automated editing and blanking of signals that have rapid or dis- 
tinctive temporal variability or that are confined to narrow frequency ranges. 
Since 21 cm emission from z > 6 is relatively smooth on scales < 1 MHz, 
narrow signals with widths < 50 kHz can be discarded and the broader band 
of interest constructed from the clean bands between the RFI spikes. 
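The editing strategy described here (discard narrow spikes, then rebuild the broad band from the clean channels) can be illustrated with a toy example. This is only a sketch under simple assumptions: it flags channels that deviate from the band median by more than a chosen multiple of a robust scatter estimate, which is one common choice and not necessarily the method of the reviews cited. The channel layout and spike amplitudes are hypothetical.

```python
import math
import statistics


def flag_narrow_rfi(power, n_sigma=5.0):
    """Flag channels deviating from the band median by more than n_sigma.

    The scatter is estimated from the median absolute deviation (MAD),
    so a few strong spikes do not inflate the threshold.
    """
    median = statistics.median(power)
    mad = statistics.median([abs(p - median) for p in power])
    sigma = 1.4826 * mad  # MAD -> Gaussian-equivalent standard deviation
    return [abs(p - median) > n_sigma * sigma for p in power]


def clean_band_mean(power, flags):
    """Average the unflagged channels to rebuild the broader band."""
    good = [p for p, bad in zip(power, flags) if not bad]
    return sum(good) / len(good)


if __name__ == "__main__":
    # Hypothetical 1 MHz band split into 40 channels of 25 kHz each:
    # a smooth spectrum plus two narrow interference spikes.
    spectrum = [1.0 + 0.01 * math.sin(i / 3.0) for i in range(40)]
    spectrum[7] += 50.0
    spectrum[23] += 8.0
    flags = flag_narrow_rfi(spectrum)
    print("flagged channels:", [i for i, bad in enumerate(flags) if bad])
    print("clean band mean:", round(clean_band_mean(spectrum, flags), 3))
```

A median-based threshold is used because a plain standard deviation would be dominated by the spikes themselves. Real pipelines also have to handle time variability and broadband interference, which this sketch ignores.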

Other techniques are being developed to identify, characterize, and ultimately 
subtract (or cancel) interfering signals [429-431], permitting astronomical ob- 
servations to be performed in actively used communication bands. So far, these 
experimental methods work best for broadcast transmitters that are fixed to 
the Earth, and there are limits to the achievable precision of the cancellation. 
Thus the first line of defense remains locating radio telescopes in the most 
benign environments available. 

Telescope design also plays a role in mitigating interfering signals. For example,
the antenna tiles illustrated in Figure 27 project nulls in their response
patterns toward the horizon. This discriminates against terrestrial transmitters.
Furthermore, tiles on the ground gain additional protection from hills
and foliage. But, like all radio telescopes, they remain vulnerable to satellite
emission.

9.4.3 Ionospheric Distortions

Because of its temporally and spatially variable index of refraction, the Earth's 
ionosphere distorts low frequency radio signals as they propagate through it. 
This creates significant calibration and imaging problems that must be solved 
in order to clean the strong foreground contamination reliably. 

Figure 36 is a cartoon sketch of the wavefront distortion experienced by a
compact array [432]. Lines of sight from all the antennae in the array observe
a given celestial radio source through nearly the same "refractive wedge,"
which to first order causes a linear gradient to develop in the phase of a
wavefront as it passes through the ionosphere. This causes the array to perceive
the wave to originate from a different direction on the sky than the true
one. Wavefronts arriving from sources at different locations will pass through
independent wedges, so their apparent positions will be displaced by different
amounts. The net effect is that the radio sources will remain coherent to the
array, but they will appear to wobble on time scales of tens of minutes (the
characteristic scale of fluctuations in the ionosphere). Calibration will require
the inclusion of algorithms to track and update an ionospheric distortion model
using an all-sky grid of bright radio sources. The problem and its solutions are
qualitatively similar to adaptive optics in optical astronomy, although here all
the corrections will be done on the software (rather than hardware) level and
the spatial and temporal scales are significantly larger. Foreground subtraction
and imaging will rely on the corrected catalog of source positions.

Fig. 36. Cartoon illustration of ionospheric distortions. Wavefronts (represented by
dashed lines) are distorted as they pass through the phase screen of the ionosphere
(solid wavy line) and develop nearly linear phase gradients, which then project onto
the array. The gradient introduced by the screen varies with position across the sky.
From [432].

In addition to the variable index of refraction, the ionosphere also induces 
variable Faraday rotation, which rotates the position angle of the electric po- 
larization vector so that polarized foregrounds artificially appear to vary in 
time at the telescope. This too must be evaluated and calibrated on time scales 
of minutes as part of the observational procedures. 

Considerable experience with ionospheric distortion has been obtained during 
the VLA Low Frequency Sky Survey at 74 MHz [433]. Among the principal 
findings has been the significant increase in complexity for interferometer ar- 
rays extending over more than a few tens of kilometers, because then the array 
samples many independent refractive wedges in the ionosphere. This further 
complicates both calibration and imaging. Fortunately, the 21 cm line observa- 
tions that concern this review need only short baselines of a few kilometers in 
length to be sensitive to the low surface brightness features with characteristic 
scales of arcminutes expected during reionization.

Fig. 37. Mock spectra of high-redshift radio sources at $z_s = 10$ (left panel) and
$z_s = 8$ (right panel) produced from cosmological simulations. In each case a source
(with intrinsic luminosity equal to Cygnus A) is placed at the appropriate redshift
and "observed" in a week-long integration with an SKA-class instrument. A "forest"
of absorption features appears blueward of $21(1 + z_s)$ cm, caused by the cosmic web.
Note that the level of absorption depends sensitively on the assumed thermal and
ionization history of the IGM. From [349].
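As a rough check on the baseline lengths mentioned in the ionospheric discussion above, the angular scale probed by a baseline of length B at wavelength λ is approximately λ/B. The sketch below evaluates this estimate; the specific frequencies and baselines are illustrative choices, not values taken from the text.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light


def resolution_arcmin(freq_mhz, baseline_m):
    """Approximate angular resolution lambda / B of a baseline, in arcminutes."""
    wavelength_m = C_M_PER_S / (freq_mhz * 1.0e6)
    return math.degrees(wavelength_m / baseline_m) * 60.0


if __name__ == "__main__":
    # A 2 km baseline at 150 MHz (21 cm line from z ~ 8.5)
    print(resolution_arcmin(150.0, 2000.0))
    # A 30 km baseline at 74 MHz, comparable to long VLA baselines
    print(resolution_arcmin(74.0, 30000.0))
```

Under this estimate a baseline of a couple of kilometers at 150 MHz reaches scales of a few arcminutes, consistent with the statement that short baselines suffice for reionization-era features.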

10 The 21 cm Forest 



To this point we have focused on constructing large-scale three-dimensional 
21 cm maps of the IGM (or at least characterizing their statistical properties). 
Such observations promise to provide powerful constraints on reionization and 
early structure formation, but they pose three major difficulties: (1) they require
$T_S \neq T_\gamma$ and may therefore be impossible during certain epochs; (2)
they cannot realistically resolve structures smaller than ~ 1 Mpc; and (3) 
they present a number of observational challenges (see §9). 

A complementary probe immune to most of these problems is the "21 cm for- 
est" [349,427,434]. The technique, illustrated in Figure 37, is the exact analog 
of the Lyα forest that has proved so useful for studying the z < 6 IGM. Neutral
hydrogen along the line of sight to a distant radio source will resonantly
absorb photons that redshift into the 21 cm transition, creating a forest of 
absorption features in the spectrum due to the diffuse IGM, sheets and fila- 
ments in the cosmic web, minihalos, and HII regions. The major advantage 
of this method is that, with a sufficiently bright source, arbitrarily high fre- 
quency - and hence spatial - resolution can be achieved, allowing us to probe 
individual filaments and minihalos. Moreover, because it requires only a high 
signal-to-noise spectrum of a relatively bright radio source, the 21 cm forest 
is a much simpler observation than tomography.

Fig. 38. Mean optical depth in some example reionization histories. (a): Ionization
history, $x_i(z)$. Note that $x_i$ is independent of $f_X$, so we only show two curves here.
(b): Optical depth $\tau$. The models follow those in Figs. 7 and 8. From [434].

10.1 Spectral Features 



The fundamental quantity of interest for the 21 cm forest is the optical depth, 
given by equation (12) for an isolated cloud or equation (15) for the IGM. At 
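high redshifts, the latter becomes

\begin{equation}
\tau \approx 0.011 \, x_{\rm HI} \, (1 + \delta) \left( \frac{T_\gamma}{T_S} \right) \left( \frac{1 + z}{10} \right)^{1/2} \left[ \frac{H(z)/(1 + z)}{dv_\parallel/dr_\parallel} \right]. \tag{166}
\end{equation}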



As usual, the observable quantity is the brightness temperature decrement
$\delta T_b$. However, in this case the background radiation field comes from a source
with brightness temperature $T_b \gg T_S$; thus $\delta T_b \propto T_S^{-1}$. Unlike tomography,
the forest's brightness always depends on $T_S$.

We will consider four aspects of the forest. The first is the mean level of 
IGM absorption, which is determined by $x_{\rm HI}(z)$ and $T_S(z)$. We have seen
in §3 that these quantities depend on a number of unknown parameters in
the star formation history, including the sources' spectral energy distribution, star
formation efficiency, escape fraction, and stellar initial mass function. We can 
use the same models as in §3 to study the qualitative features of the evolution. 
Figure 38 shows the mean optical depth in several of the models from Figures 7 
and 8.

At sufficiently high redshifts, before the Wouthuysen-Field effect has become
significant, $T_S = T_\gamma$ and $\bar{\tau} \sim 0.02$. In most of the models, it then briefly
increases by a factor of a few because Lyα coupling becomes efficient while the
IGM is still cold. However, this period is short-lived, because $\bar{\tau}$ declines rapidly
once X-ray heating begins. As we saw in equations (89) and (90), reionization
is delayed compared to Lyα coupling and X-ray heating. As a result, by the
time reionization begins in earnest and large HII regions appear, $T_S$ is already
large and $\bar{\tau} \sim 10^{-3}$. This is a crucial characteristic of the 21 cm forest: in the
most plausible scenarios, $\bar{\tau}$ is only large long before reionization. This is, of
course, unfortunate, because powerful radio sources probably do not appear
until powerful ionizing sources do. Thus lines of sight with strong absorption
are likely to be rare. Simulations find similar results, even if X-ray heating is
neglected (Fig. 37 and [349]). The right-hand panel of Figure 37 has $\bar{\tau} \sim 0.1\%$
at $x_i \sim 0.5$ (or z = 8 in this particular simulation). Once X-ray heating is
included, the prospects are even worse.
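The trend described in this paragraph can be reproduced with a few lines of arithmetic. The sketch below assumes, as a labeled assumption rather than a quotation from the text, the commonly used scaling τ ≈ 0.011 x_HI (1 + δ) (T_γ/T_S) [(1 + z)/10]^(1/2) for gas in the Hubble flow, together with a present-day CMB temperature of 2.726 K. The spin temperature and neutral fraction in the second example are purely illustrative.

```python
# Sketch: mean 21 cm optical depth of the IGM for gas in the Hubble flow.
# Assumptions (not taken verbatim from the text): the scaling
#   tau ~ 0.011 * x_HI * (1 + delta) * (T_gamma / T_S) * ((1 + z) / 10)**0.5
# and a present-day CMB temperature of 2.726 K.
T_CMB_TODAY_K = 2.726


def cmb_temperature(z):
    """CMB temperature in K at redshift z."""
    return T_CMB_TODAY_K * (1.0 + z)


def mean_optical_depth(z, t_spin_k, x_hi=1.0, delta=0.0):
    """Optical depth for neutral fraction x_hi and overdensity delta."""
    return (0.011 * x_hi * (1.0 + delta)
            * cmb_temperature(z) / t_spin_k
            * ((1.0 + z) / 10.0) ** 0.5)


if __name__ == "__main__":
    # Before Wouthuysen-Field coupling: spin temperature equals the CMB's
    print(mean_optical_depth(30.0, cmb_temperature(30.0)))
    # Midway through reionization with a heated IGM (illustrative values)
    print(mean_optical_depth(8.0, 100.0, x_hi=0.5))
```

Under these assumptions the optical depth is about 0.02 when the spin temperature tracks the CMB at z ~ 30, and falls to roughly 10^-3 once the gas has been heated to ~100 K and half of it is ionized.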

The next set of features come from sheets and filaments. These stand out in 
Figure 37 as $\tau > 2\%$ absorption spikes. Naively, one might expect that these
features would strengthen with time as the cosmic web condenses out of the 
IGM. But Figure 37 shows the opposite: they actually grow weaker, because 
the increasing spin temperature washes them out. Assuming the absorbers are 
near hydrostatic equilibrium (an excellent approximation for the analogous 
Lyα forest [435]), the typical overdensity of an absorber with optical depth $\tau$
is [434] 

, + ^ 94 (^r c^u 2 r («=*r\ f Miy . (167) 

Thus, if the IGM has been even moderately heated, only virialized objects
host measurable absorption, and the cosmic web spikes become rare. In the
simulation of [349], $T_S \sim 30$ K throughout much of the IGM at z = 10, but by
z = 8 it has risen to $T_S > 100$ K everywhere.$^{54}$ Over the same interval, the
number density of $\tau > 0.02$ lines drops from ~ 50 per unit redshift at z = 10
to ~ 4 at z = 8. Again we see that strong cosmic web absorption probably
requires finding bright radio sources that shine long before reionization begins.
These features are also narrow, with $\Delta\nu_{\rm obs} \sim 2$ kHz, determined by thermal
and Hubble broadening.

54 It is worth noting again that this simulation lacked X-ray heating, so it probably
underestimates the temperature.

Minihalos also contribute to the forest [427]. We described 21 cm emission from
these objects in §6.1; while it can be reasonably strong, it is nearly impossible
to distinguish from that of the diffuse IGM without high sensitivity on small
scales. But the 21 cm forest suffers no such confusion, because it allows us
to resolve individual objects; indeed, it is the only known method to study
these objects in detail. The excess optical depth through a minihalo depends
in detail on its gas density profile and temperature distribution [427]. But we
can estimate the typical central optical depths by assuming
$N_{\rm HI} \sim n_{\rm HI} r_{\rm vir} \sim \Delta_{\rm vir} \bar{n}_H r_{\rm vir}$ (where $\Delta_{\rm vir} \approx 18\pi^2$ is the mean overdensity of virialized objects;
see [436] for a numerical fit) and that $T_S \sim T_K$ (because the gas is so dense).
Then, using equation (13), we have (at line center)$^{55}$



Thus the optical depths can even exceed those of sheets and filaments, and
they are much more robust to uncertainties about the Lyα background and
the IGM temperature. The typical line widths (due to thermal broadening)
are $\Delta\nu_{\rm obs} \sim 2$ kHz, comparable to those of sheets and filaments.
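The quoted line width can be checked with the thermal Doppler parameter b = (2 k T_K / m_H)^(1/2), redshifted into the observed frame. The sketch below does this for an illustrative gas temperature of 1000 K at z = 10; both values are assumptions chosen for the example, and the width returned is the Doppler-parameter width, not a full width at half maximum.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
M_H = 1.6735575e-27        # mass of a hydrogen atom, kg
C_M_PER_S = 299_792_458.0  # speed of light
NU_21CM_HZ = 1.42040575e9  # rest frequency of the 21 cm line


def thermal_width_khz(t_kinetic_k, z):
    """Observed-frame thermal Doppler width of the 21 cm line, in kHz."""
    b = math.sqrt(2.0 * K_B * t_kinetic_k / M_H)  # Doppler parameter, m/s
    nu_obs_hz = NU_21CM_HZ / (1.0 + z)
    return nu_obs_hz * b / C_M_PER_S / 1.0e3


if __name__ == "__main__":
    # Illustrative minihalo gas temperature of 1000 K at z = 10
    print(thermal_width_khz(1000.0, 10.0))
```

For these inputs the width comes out a little under 2 kHz, which shows why kHz-scale channels are needed to resolve individual features.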

Figure 39 shows the resulting differential number densities of minihalo absorbers
at z = 10 and 20 (solid and dashed lines, respectively). At each redshift,
the top curves assume $T_K = T_{\rm ad}$ (the temperature without X-ray heating;
shown in Fig. 6). The remaining curves use $T_K$ = 20, 100, and 1000 K, from
top to bottom, to compute the Jeans mass. In a cold IGM, the number of
minihalo features is large (with > 100 absorbers per unit redshift at z = 10)
and they can be rather strong (with > 20 absorbers with $\tau > 0.1$ per unit
redshift). Minihalos are of course much rarer at z = 20, simply because structure
formation is so much less advanced at that time.

Clearly the number of absorbers decreases rapidly as $T_K$ increases, especially
for large optical depths. This is because the smallest minihalos produce the
strongest features; increasing the Jeans mass by even a small amount suppresses
them. The weaker absorption features with $\tau \sim 0.01$ are more robust to
$T_K$ because they come from larger halos. Comparison with simulations shows
that minihalo features have comparable abundance to those of the cosmic 
web [349,427], although minihalos do probably persist longer during reioniza- 
tion once X-ray heating is included [434]. 

Measuring the frequency of minihalo features is thus a sensitive probe of the 
thermal history of the IGM. More sophisticated models of minihalo formation 
in the presence of thermal feedback show that the lack of a minihalo forest, in 
combination with large-scale 21 cm emission, could identify warm regions in 
which an HII bubble had recombined [321]. Such regions would be invisible in 
most other ways. 



The total equivalent width of a minihalo line is proportional to m 1 ^ 3 . 




(168) 



Fig. 39. Differential number density of minihalo absorption features at z = 10
(solid curves) and z = 20 (dashed curves), plotted against optical depth $\tau$.
From top to bottom, each set takes the minimum minihalo mass to be the Jeans
mass of a uniform medium at $T = T_{\rm ad}(z)$, 20, 100, and 1000 K. From [434].



Ionized bubbles create gaps in the absorption, with an amplitude equal to
the mean optical depth of the neutral gas. Figure 40 shows the resulting
distribution of absorption features (transformed into differential form) for
$x_i$ = 0.1, 0.3, 0.5, 0.7, and 0.9 at z = 10. It has a number of interesting
features. First, except at $x_i = 0.1$, the distribution is flat with $\Delta\nu_{\rm obs}$. This is
because $n_b(m)$ has a well-defined characteristic size in this model. Crucially,
because the bubbles are relatively large, the spectral features are quite wide
- more than an order of magnitude larger than those of filaments or minihalos,
even when $x_i = 0.1$. Thus they may indeed be the simplest features to
identify, despite their relatively small contrast (see Fig. 38). Second, the total
number of HII regions per unit redshift remains roughly constant throughout
reionization, from ~ 20 at $x_i = 0.1$ to ~ 6 at $x_i = 0.9$. This is because most
ionizations occur as existing bubbles grow larger rather than by creating new
bubbles [22]. However, the overall distribution of $\Delta\nu_{\rm obs}$ does evolve rapidly
as $x_i$ increases, so the forest will certainly prove useful in constraining the
reionization history.

Finally, for completeness, dense neutral gas inside galaxies (similar to DLAs 
at z < 4 [437]) can induce rather large optical depths. But such lines of sight 
are rare (dn/dz ~ 0.1 [427]), so collecting a large sample will be difficult.

Fig. 40. Size distribution of HII region forest gaps intersected per unit redshift at
z = 10. The dotted, dot-dashed, short-dashed, long-dashed, and solid lines take
$x_i$ = 0.1, 0.3, 0.5, 0.7, and 0.9, respectively. From [434].

10.2 Is It Feasible? 



As we mentioned above, 21 cm forest observations are much simpler than to- 
mography, because they can essentially ignore most of the challenges discussed 
in §9. Their feasibility boils down to two questions: raw sensitivity and the 
existence of background sources. Detecting a feature with signal-to-noise ratio 
S/N requires a source brightness (assuming simultaneous observations of two 
orthogonal polarizations) 

\begin{equation}
S_{\rm min} = 16 \ {\rm mJy} \left[ \frac{S/N}{5} \, \frac{0.01}{\tau} \, \frac{2500 \ {\rm m^2/K}}{A_{\rm eff}/T_{\rm sys}} \right] \left[ \frac{1 \ {\rm kHz}}{\Delta\nu_{\rm ch}} \, \frac{1 \ {\rm week}}{t_{\rm int}} \right]^{1/2},
\end{equation}

where $\Delta\nu_{\rm ch}$ is the bandwidth of each channel (assumed smaller than the feature
of interest) and $t_{\rm int}$ is the total "on-source" integration time. These telescope
parameters are similar to those expected for the SKA at z = 10, shown in the
mock spectra of Figure 37. Clearly, observing any of the narrow features described
above requires long integrations on SKA-class instruments and rather
bright sources; for example, Cygnus A would have $S_\nu < 20$ mJy if it existed
at such redshifts.
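The sensitivity requirement can be checked against the standard radiometer equation. The sketch below assumes point-source noise per channel of √2 k_B / [(A_eff/T_sys) (Δν t)^(1/2)] for two combined polarizations, and an absorption feature of depth τS against a source of flux density S. The default signal-to-noise ratio of 5 is an assumption, adopted because it reproduces the quoted normalization of 16 mJy; all names are illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
JY_IN_SI = 1.0e-26   # 1 Jy in W m^-2 Hz^-1
WEEK_S = 7 * 86400.0


def minimum_source_flux_mjy(snr=5.0, tau=0.01, a_eff_over_t_sys=2500.0,
                            channel_hz=1.0e3, t_int_s=WEEK_S):
    """Source flux density (mJy) needed to detect a feature of depth tau.

    a_eff_over_t_sys is in m^2/K; two orthogonal polarizations are assumed.
    """
    # Point-source noise per channel, both polarizations combined
    noise_si = (math.sqrt(2.0) * K_B
                / (a_eff_over_t_sys * math.sqrt(channel_hz * t_int_s)))
    # The feature has depth tau * S, so S = snr * noise / tau
    return snr * noise_si / tau / JY_IN_SI * 1.0e3


if __name__ == "__main__":
    # Narrow feature: tau = 0.01 in a 1 kHz channel
    print(minimum_source_flux_mjy())
    # Wide feature: tau = 0.001 across 100 kHz needs the same source flux
    print(minimum_source_flux_mjy(tau=1.0e-3, channel_hz=1.0e5))
```

Because the noise scales as the inverse square root of the channel width, a feature one hundred times wider can be ten times shallower for the same source brightness.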

This requirement can be alleviated in some circumstances. First, minihalos and
filaments introduce an extra fluctuating component to the spectra that could
be detected statistically by, for example, computing a running rms noise statistic.
This reduces $S_{\rm min}$ by a factor of at least three [349]. Second, HII regions
can have significantly larger widths. Bubbles with $\Delta\nu_{\rm obs} \sim 100$ kHz reduce
the required optical depth to $\tau \sim 10^{-3}$ at $S_{\rm min} = 16$ mJy, and significantly
weaker sources would suffice if $\tau$ is larger.

Unfortunately, these thresholds are still relatively large. Interestingly, when 
extrapolated to higher frequencies, the possible background sources are well 
within reach of existing surveys (such as FIRST [438]), assuming that they 
have spectra typical of radio galaxies. Thus, appropriate sources may have 
already been detected. Of course, none have been identified so far, and that 
step will pose the real challenge. 

The most promising background sources are radio-loud quasars, although their 
luminosity function is relatively unconstrained at z > 4 [439]. Under the rea- 
sonable assumption that their abundance declines along with the bright optical 
quasars, the sky would contain ~ 1000 (10) sources at z > 8 (12) [349]. Models 
of the radio luminosity function over a range of redshifts (matching existing 
data where available) predict ~ 2000 sources across the sky with S > 6 mJy 
at 8 < z < 12 [440] (although this is quite sensitive to the assumed radio- 
loudness parameter). According to this model, the FIRST survey should have 
already found > $10^3$ objects with z > 7 - albeit out of a total sample of
~ 750,000 sources! These estimates predict a reasonably large number of lines
of sight to z < 12. Beyond that point, however, we will likely be limited to
just a few extreme objects. This is unfortunate in light of Figs. 38-40, which 
show that the absorption rapidly weakens during reionization itself. The rare, 
high-redshift sources have the potential to provide the most information about 
the early universe, but guessing at their abundance is nearly impossible. It is 
therefore crucial to push searches to the highest possible redshifts. 

Fortunately, there are no fundamental reasons to expect a cutoff in the radio 
source population at high redshifts. While the evolving IGM will affect ex- 
tended radio lobes, compact radio sources are driven by local jet physics and 
so should not depend on the large-scale environment [440]. The CMB energy 
density does increase at higher redshifts, increasing the importance of inverse- 
Compton cooling relative to (radio) synchrotron emission. The two become 
energetically comparable at z ~ 6, but this should only steepen the spectrum 
rather than quench the radio emission [349]. Such steepening could be useful 
in identifying high-redshift radio sources. Most persuasively, the fraction of 
radio-loud objects does not seem to evolve, even to z ~ 6 [441,442]. 

Another possible set of sources are GRBs and hypernovae. Unfortunately, 
although they should occur at high redshifts [27,30], afterglow models predict 
that only the most energetic events achieve the required flux densities, and only 
then if they occur in exceptionally diffuse environments [443]. Thus it seems
unlikely that transient sources can be used for the 21 cm forest, although
detecting 21 cm absorption from their host galaxies is not at all unreasonable 
[427]. 



11 Complementary Observations 

Before closing, we wish to consider 21 cm observations in relation to other 
probes of the high-redshift universe. As we shall see, all of these other tech- 
niques require luminous sources and so focus on the eras of the first luminous 
objects and of reionization (§7 and §8). Between the surface of last scattering
and the time that the first luminous sources appear, the 21 cm sky is the only
known probe available to us (assuming that galaxies reionize the Universe, of
course).

11.1 Galaxy Surveys 

The utility of comparing 21 cm surveys to the galaxy (and quasar) distribution 
is obvious: it will identify the sources responsible for reionizing each slice 
of the IGM (or at least their descendants). Of course, such galaxy surveys 
are intrinsically difficult, as the sources are extremely faint (because of their 
distance and small sizes). Moreover, Lyα absorption by the intervening IGM
is essentially complete for rest wavelengths blueward of 1216 Å at z > 6,
requiring near-infrared cameras that are generally less advanced than their 
optical counterparts. 

Nevertheless, surveys have already begun to probe the reionization epoch. 
Deep integrations with the Hubble Space Telescope, such as the GOODS sur- 
vey and the UltraDeep Field, have detected a number of galaxies at z ~ 6 
using the photometric dropout technique (e.g., [444-446]) and perhaps even 
some objects at higher redshifts [447,448]. Population synthesis models, in 
conjunction with optical and infrared observations, can also be used to iden- 
tify high-redshift candidates; this technique has already led to some surprising 
identifications of massive ($M_* > 10^{11} M_\odot$) galaxies at z ~ 6-7 [449]. Large-format
ground-based surveys have also successfully identified photometric
dropouts [450,451]. 

An alternative to dropout surveys (which require deep spectroscopic followup 
for confirmation) is to search for objects with strong Lyα emission lines. Specific
redshift windows can be searched with narrowband filters, which offer
the enormous advantage of working between the bright sky lines that nearly
blanket the near-infrared sky. Observations in one such window at z = 6.56
have already found dozens of objects [452-456], and windows at z ~ 8-10 are
now being explored [457-460]. Unfortunately, these surveys can only detect 
the brightest objects (although gravitational lensing can probe further down 
the luminosity function in small patches of sky [460-462]). Substantial sam- 
ples of less unusual galaxies must await larger telescopes more optimized for 
the near-infrared, such as the James Webb Space Telescope or a ground-based 
30 meter telescope. Such surveys are interesting for any number of reasons, 
but drawing conclusions about reionization is difficult because of the many 
uncertainties in the stellar initial mass function, metallicity, escape fraction, 
and the IGM recombination rate [463,464]. Thus the correlation with 21 cm 
maps may in the end prove the best way to learn about this aspect of the 
sources. Of course, such a comparison will be easiest with a well-understood 
galaxy population (which is relatively difficult for Lyα-selected galaxies).

Quasar searches have also reached z ~ 6, although again only at the bright 
end of the luminosity function [4-7,11]. Nevertheless, these surveys have al- 
ready taught us that bright quasars are much too rare to be responsible for 
reionization. These luminous objects also provide an intriguing set of targets 
for 21 cm imaging, because their HII regions are likely to be much larger than 
average (see §8.4). 

While the merits of comparing the galaxy distribution with HII regions de- 
tected by 21 cm observations are clear, galaxy observations can also teach us 
about the ionized gas distribution on their own. Consider a galaxy embedded 
in the neutral IGM. Some ionizing photons leak out into the IGM and create 
an HII region, but most are absorbed within the galaxy and reprocessed into 
Lyα photons. When these photons diffuse out of the galaxy, they will initially
propagate through the HII region, suffering relatively little absorption.
However, once they encounter the neutral IGM, they will quickly be absorbed
unless they have traveled far enough to redshift out of line center. Of course,
the amount of redshifting depends on the size of the host HII region; thus we
expect strong Lyα lines to gradually fade toward higher redshifts [37-39].

Quantitatively constraining $x_i(z)$ with this tool is difficult for two reasons.
First, the intrinsic characteristics of the galaxies - especially their winds - can
strongly affect their Lyα line properties [40]. This problem can be alleviated
(to some extent) by comparing galaxy populations at two different redshifts
(preferably closely spaced enough in time to avoid significant cosmic evolution).
For example, Lyα-selected samples at z = 5.7 (which we know to be
after reionization) and at z = 6.5 (which may be in medias res) show no
significant differences between their luminosity functions, at least within the 
statistical errors of the observations [16]. The second "problem" is that the 
bubble sizes depend on the sources of reionization [17,465-467]; see Figure 41 
for an illustration of the constraints that can be set.

Fig. 41. (a): Lyα luminosity functions at z = 6.5. The thick solid line shows an
assumed intrinsic function, with luminosity $L \propto m_{\rm gal}$. The observations require the
measured luminosity function to lie above the thin solid line [16]. The other curves
show how IGM Lyα absorption affects the luminosity function at several different
$x_i$. (b): Ratio of attenuated to intrinsic luminosity functions. From [17].

Fortunately, this second "bug" is actually a powerful tool for studying reionization.
Although large bubbles weaken the absorption, their existence at relatively
early times also implies that Lyα surveys will successfully find galaxies
throughout the middle stages of reionization. Moreover, the survey selection
function will be modulated by the ionized bubbles: lines of sight passing
through large HII regions will contain many galaxies while other lines of sight
will appear entirely empty even though their intrinsic galaxy densities could
be nearly as large [17,466]. Thus Lyα surveys can be used to map the bubble
distribution (albeit only indirectly), making them the best known alternative
to 21 cm measurements. The major difficulties are surveying sufficiently large
volumes to the required sensitivities and understanding the intrinsic Lyα properties
and clustering of the galaxies (which is probably easiest in conjunction
with dropout surveys). Nevertheless, comparing the two kinds of maps offers 
exciting possibilities to learn about the sources and their interaction with the 
IGM. 

Galaxies can also be studied indirectly through the near-infrared (NIR) background
and its fluctuations [468]: these same Lyα photons travel through the
IGM without being destroyed (only scattered) and eventually redshift to
wavelengths > 1 μm at the present day. Thus the NIR background contains information
about the recombination history (which, of course, also constrains the
ionizing luminosity) [469,470]; provided that most recombinations occur inside
or near galaxies (as is true if $f_{\rm esc} \ll 1$), its fluctuations directly trace those
of the galaxy population [471-473] and are therefore complementary to 21 cm 
tomography of the IGM. However, studies of the NIR background suffer from 
many of the same challenges as 21 cm observations: the signal is strongly dom- 
inated by foregrounds (in this case the zodiacal light and moderate redshift 
galaxies), which must somehow be separated from the cosmological signal. As 
a result, the recent claimed detection of Pop III stars via their fluctuations in 
deep Spitzer fields [474] remains controversial [475,476]. There may be useful 
synergies between the data analysis techniques of both fluctuation searches 
and in cross-correlation of the two datasets (to eliminate those foregrounds 
that contribute to only one probe); the main challenge is the large disparity 
in angular scales subtended by the relevant telescopes: the NIR searches use 
fields < 30 arcmin across, while 21 cm observations will span hundreds of 
square degrees. 



11.2 Quasar and GRB Spectra 

Just as with the 21 cm line, Lyman series absorption in the spectra of high- 
redshift quasars traces neutral hydrogen in the early Universe. However, since 
the cross sections for such permitted transitions are ~ $10^7$ times larger,
optical/UV absorption spectra saturate much more rapidly than the 21 cm line
(which is still optically thin even in a fully neutral universe). Quasar spectra 
are therefore primarily useful for studying the tail end of reionization, when 
the neutral fraction is small. This very property makes quasar absorption spec- 
tra highly complementary to 21 cm probes, because they are most powerful 
precisely when $x_{\rm HI}$ (and the 21 cm signal) are plummeting. On the other hand,
despite the small neutral fraction, the tail end of reionization has rich physics 
and any number of unanswered questions. For example, how does the tran- 
sition from the "bubble-dominated" topology characteristic of reionization to 
the "web-dominated" topology we see at lower redshifts (where Lyman-limit 
systems embedded in filaments consume most of the ionizing photons) occur? 
How large do the ionized bubbles get? How does the IGM clumpiness evolve
as the Universe is ionized and heated? Quasar spectra are ideally suited to 
answer these and other questions that arise when the 21 cm sky is fading. 
However, we must keep in mind that, while quasar spectra can potentially 
constrain the global neutral fraction and topology of reionization, in practice 
these are difficult questions, because making inferences about the ionization 
state of the universe from a saturated absorber is fraught with uncertainty. 

As with many topics we have discussed in this review, our understanding 
of the z ~ 6 universe as revealed by quasar spectra is evolving rapidly; to 
date, 19 quasars with redshifts $5.74 < z_{\rm em} < 6.42$ have been discovered in
the SDSS [7]. Thus far, the following spectral features have been used to
deduce IGM properties: (i) the evolution of the mean Gunn-Peterson optical
depth in the Lyα, Lyβ, and Lyγ transitions, as well as its line-of-sight scatter; (ii)
the distribution of lengths of dark absorption gaps and transmission spikes; 
(iii) the detection of damping wings; (iv) the size of HII regions around high- 
redshift quasars; and (v) metal absorption lines. We will discuss each of these 
in turn (see Fig. 1 for a summary of some of the resulting constraints). 

Because only a small fraction of the IGM need be neutral for Lyα Gunn-Peterson
absorption to saturate (see eq. 43), Lyβ and Lyγ absorption provide better
limits by a factor of 2-4, though they are affected by uncertainty in the
foreground absorption from lower-order transitions.^56 Since most transmission
arises in underdense voids (where the recombination time is longest), quasar
spectra provide little information about regions of the IGM where most baryons
reside [9]; thus, estimates of the neutral fraction can at best constrain
$x_{\rm HI} > 10^{-3}$-$10^{-2}$ and are dependent on theoretical models of the IGM density
distribution. Nonetheless, striking observational patterns emerge for some
reasonable models. According to [7] (who use the IGM model of [10]), the
effective optical depth $\tau_{\rm eff} = -\ln(T)$ (where T is the transmission fraction)
increases rapidly at high redshift, accelerating from $\tau_{\rm eff} \propto (1+z)^{4.3}$ to
$(1+z)^{\gtrsim 11}$ at z > 5.7. This rapid evolution has been interpreted by some to
indicate sudden evolution in the radiation field and mean free path of ionizing
photons, as might be expected when the ionized bubbles percolate (e.g.,
[11,477,478]). On the other hand, an empirically motivated model of the mean
optical depth evolution does not show such a break and implies that no such
dramatic event is occurring [479] (see also [8]). Better data and modeling will
clearly be needed to settle this debate.
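
To make the saturation argument concrete, the sketch below evaluates the Gunn-Peterson optical depth of a uniform IGM from first principles, $\tau = (\pi e^2/m_e c)\, f \lambda\, x_{\rm HI}\, n_H(z)/H(z)$, and the Lyα to Lyβ and Lyγ optical depth ratios implied by the oscillator strengths. The cosmological parameters (h = 0.7, Ω_m = 0.3, Ω_b h^2 = 0.022, helium mass fraction 0.24) and all function names are illustrative assumptions rather than values taken from the text, and the ratios ignore the clumping correction described in footnote 56.

```python
import math

# Physical constants in cgs units.
SIGMA_INT = 0.02654        # pi e^2 / (m_e c), in cm^2 Hz
M_P = 1.6726e-24           # proton mass, g
RHO_CRIT_H2 = 1.8785e-29   # critical density divided by h^2, g cm^-3
MPC_CM = 3.0857e24         # one megaparsec in cm

# Lyman-series lines: (oscillator strength, rest wavelength in cm).
LINES = {
    "alpha": (0.4162, 1215.67e-8),
    "beta": (0.0791, 1025.72e-8),
    "gamma": (0.0290, 972.54e-8),
}


def hubble(z, h=0.7, omega_m=0.3):
    """Hubble rate in s^-1 for a flat matter + Lambda cosmology (assumed)."""
    h0 = h * 100.0 * 1.0e5 / MPC_CM
    return h0 * math.sqrt(omega_m * (1.0 + z) ** 3 + (1.0 - omega_m))


def n_hydrogen(z, omega_b_h2=0.022, y_he=0.24):
    """Mean proper hydrogen number density in cm^-3."""
    return (1.0 - y_he) * omega_b_h2 * RHO_CRIT_H2 / M_P * (1.0 + z) ** 3


def tau_gp(z, x_hi=1.0, line="alpha"):
    """Gunn-Peterson optical depth of a uniform IGM with neutral fraction x_hi."""
    f_osc, wavelength = LINES[line]
    return SIGMA_INT * f_osc * wavelength * x_hi * n_hydrogen(z) / hubble(z)


tau_alpha = tau_gp(6.0)
print("tau_alpha (fully neutral, z = 6):", tau_alpha)
print("x_HI giving tau_alpha = 1:", 1.0 / tau_alpha)
print("tau_alpha / tau_beta :", tau_alpha / tau_gp(6.0, line="beta"))
print("tau_alpha / tau_gamma:", tau_alpha / tau_gp(6.0, line="gamma"))
```

Under these assumptions a fully neutral IGM at z = 6 has an optical depth of a few times 10^5, so a neutral fraction of only a few times 10^-6 already gives an optical depth of order unity, while the uniform-IGM ratios are about 6.2 for Lyβ and 17.9 for Lyγ.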

Accompanying this sudden evolution is a rapid increase in the dispersion of
$\tau_{\rm eff}$, even when it is smoothed on scales > Mpc. Figure 42 shows the most
dramatic example, the clear detection of transmission throughout the line of
sight to SDSS J1148+5251 (z = 6.42) [9,480,481] despite uniformly complete
absorption in the Lyα, Lyβ, and Lyγ troughs of SDSS J1030+0524 (z = 6.28)
[33,480]. This has been interpreted as evidence for patchy reionization, or at
the very least for a strongly fluctuating radiation field, which is expected near
the end of reionization [7,34]. However, such claims should be viewed with
caution: numerical simulations show that the observed sightline-to-sightline
variance is in fact consistent with density fluctuations in a uniform radiation
field [35] (see also [36]). Transmission fluctuations can be of order unity on
large (~ 50 $h^{-1}$ Mpc) scales for two reasons: (i) transmission spectra are highly
biased tracers of the underlying density fluctuations, because they are mainly
sensitive to rare voids, and (ii) projected power from small-scale transverse
modes is aliased to long wavelength line-of-sight modes. Thus, although it is
quite likely that the ionizing background does contain substantial fluctuations
at these epochs, it is extremely difficult to detect them.

56 In a clumpy IGM, $\tau_\alpha/\tau_\beta$ and $\tau_\alpha/\tau_\gamma$ are a factor 2-4 smaller than the ratio
of oscillator strengths would suggest, reducing the leverage granted by these higher
order transitions [8,9].

Fig. 42. Quasar absorption spectra. In each panel, the lower curve shows the
observed spectrum in the Lyα forest region; the overlaid curve shows the corresponding
Lyβ spectrum. J1030+0524 (z = 6.28) shows saturated absorption in both
transitions from z ≈ 5.95-6.2, while J1148+5251 (z = 6.42) shows transmission
throughout the spectrum over the same interval (and even to higher redshifts). From [480].

A useful statistic related to the observed scatter in transmission is the "dark
gap distribution," or the size distribution of regions with $\tau > \tau_{\rm min}$
[8,200,466,482-486]. Because it still provides useful information when
absorption is saturated, this is potentially the most useful quasar absorption
probe in the nearly saturated limit. Although the relation between dark gaps and
$x_{\rm HI}$ is model-dependent, the relative evolution of gap lengths is a robust
statistic. Indeed, simulations find that reionization causes a dramatic decrease
in the average length of dark gaps, as well as its dispersion [200,485]. Only
recently have simulations reached sizes comparable to dark gaps [374,486], which
is essential for making quantitative comparisons. The data show a dramatic
increase in mean dark gap length (defined with $\tau_{\rm min} = 3.5$) from
$R_{\rm dark} < 10$ to $R_{\rm dark} > 80$ comoving Mpc at z ~ 5.7 [7], with large line
of sight variations: dark gaps as long as 50 Mpc appear by z ~ 5.5, while some
lines of sight at z > 6 are still transparent [7]. This abrupt transition may or
may not be consistent with a simple thickening of the Lyα forest [36]; there is
probably more information to be mined from dark gaps. One recent study found
that, with ~ 10 z > 6 quasars, the dark gap distribution can sharply distinguish
between different reionization histories [486]. It could also constrain the
topology of reionization, at least in principle [484].
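
As a minimal sketch of how the dark gap statistic can be measured, the snippet below takes a pixelized optical depth array along one line of sight, finds contiguous runs with $\tau > \tau_{\rm min}$, and returns their lengths; it also evaluates $\tau_{\rm eff} = -\ln\langle e^{-\tau} \rangle$ for the same pixels. The function names, the pixel size, and the toy input are hypothetical choices for illustration, and the papers cited above may define gaps differently (for example after smoothing or with a minimum length).

```python
import math


def dark_gaps(tau, pixel_length, tau_min=3.5):
    """Lengths of contiguous runs of pixels with tau > tau_min.

    tau          : sequence of optical depths, one per pixel
    pixel_length : pixel size (e.g. in comoving Mpc); output has the same unit
    """
    gaps = []
    run = 0  # number of consecutive dark pixels seen so far
    for value in tau:
        if value > tau_min:
            run += 1
        elif run:
            gaps.append(run * pixel_length)
            run = 0
    if run:  # close a gap that reaches the end of the spectrum
        gaps.append(run * pixel_length)
    return gaps


def effective_tau(tau):
    """Effective optical depth, -ln of the mean transmitted fraction."""
    mean_transmission = sum(math.exp(-t) for t in tau) / len(tau)
    return -math.log(mean_transmission)


# Toy line of sight with 2 Mpc pixels (values are made up for illustration).
toy_tau = [5.0, 5.0, 1.0, 4.0, 4.0, 4.0, 0.5, 6.0]
print(dark_gaps(toy_tau, pixel_length=2.0))
print(effective_tau(toy_tau))
```

Because a gap is defined only by a threshold crossing, the statistic remains usable when the mean transmission is too small to measure $\tau_{\rm eff}$ reliably.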

Note that a large dark gap does not necessarily correspond to a large neutral
region, because the strong damping wings of Lyα can mask ionized regions.
Given an estimate for the size distribution of HII regions, the absence or
presence of transmission spikes constrains $x_{\rm HI}$ by limiting the total
contribution from damping wings (see §11.1 for the application of this to Lyα
emitting galaxies). Again, large HII regions, created by clusters of
high-redshift galaxies, increase the probability of transmission spikes [466]. A
novel test for damping wings compares the Lyα and Lyβ transmission near the
boundary of an HII region [14]. The presence of Lyβ transmission coupled with
saturated Lyα absorption over a contiguous region Δz ~ 0.02 in the proximity zone
around SDSS J1030+0524 (z = 6.28) requires strong damping wing absorption in this
region. If the absorption were solely due to the fluctuating opacity of residual
HI in mostly ionized gas, there should have been a transmission spike due
to an underdense region somewhere inside such a large stretch. The required
damping wing implies $x_{\rm HI} > 0.2$. However, this feature has only been seen in
one quasar, and its statistical significance is still not clear.

Yet another probe of the high-redshift IGM is the size of proximity zones
around high redshift quasars. Provided that the region in which Lyman series
transmission can be seen corresponds to the edge of an HII region, this
should scale as $R \propto x_{\rm HI}^{-1/3} (\dot{N}_{\rm ion} t_Q)^{1/3} (1+z)^{-1}$ in a uniform IGM, where
$\dot{N}_{\rm ion}$ is the production rate of ionizing photons and $t_Q$ is the quasar
lifetime. Given appropriate ansätze for these two parameters, it should be
possible to infer $x_{\rm HI}$; several authors have suggested, from the relatively small
sizes of the HII regions around z ~ 6 quasars, that $x_{\rm HI} > 0.2$ [15,21,23]. These
estimates are afflicted by uncertainties in the quasar lifetime, redshift (e.g.,
most quasar redshifts are determined by high-ionization lines such as CIV and
SiIV, which are systematically offset from the host galaxy's redshift), spectral
template (i.e., $\dot{N}_{\rm ion}$ for a given observed luminosity), IGM clumping factor,
and the effects of nearby galaxies. Nonetheless, the differential change in
proximity zone sizes should be a reasonably robust indicator of evolution in the
neutral fraction. The observed sizes decrease rapidly toward high redshift, as
might be expected if the IGM were becoming more neutral (see Fig. 1) [7].
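
The scaling above follows from photon counting: neglecting recombinations, the number of ionizing photons emitted over the quasar lifetime equals the number of neutral atoms inside the ionized volume. The sketch below evaluates the resulting proper radius. The photon rate of 10^57 s^-1, the lifetime of 10^7 yr, the baryon density, and the function name are illustrative assumptions, not values taken from the text.

```python
import math

MPC_CM = 3.0857e24    # one megaparsec in cm
YEAR_S = 3.156e7      # one year in seconds
N_H0 = 1.88e-7        # assumed mean hydrogen density today, cm^-3


def hii_radius_mpc(ndot_ion, t_q_yr, z, x_hi=1.0):
    """Proper radius (Mpc) of a quasar HII region in a uniform IGM.

    Assumes no recombinations and a sharp ionization front, so that
    (4 pi / 3) R^3 x_hi n_H = ndot_ion * t_Q.
    """
    n_neutral = x_hi * N_H0 * (1.0 + z) ** 3          # proper cm^-3
    volume = ndot_ion * t_q_yr * YEAR_S / n_neutral   # cm^3
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0) / MPC_CM


r_neutral = hii_radius_mpc(1.0e57, 1.0e7, z=6.3, x_hi=1.0)
r_partial = hii_radius_mpc(1.0e57, 1.0e7, z=6.3, x_hi=0.1)
print("R (x_HI = 1.0):", r_neutral, "proper Mpc")
print("R (x_HI = 0.1):", r_partial, "proper Mpc")
```

The weak $x_{\rm HI}^{-1/3}$ dependence is why the inferred neutral fraction is so sensitive to the assumed lifetime and luminosity: a factor of two error in $\dot{N}_{\rm ion} t_Q$ maps directly into a factor of two error in $x_{\rm HI}$.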

Finally, an alternative probe of the IGM ionization state is to use metal
absorption lines such as OI, SiII, and CII [487], which tend to trace HI and
have transitions redward of Lyα. Since metal atoms are less abundant than
hydrogen in the IGM by factors of ~ $10^{-5}$-$10^{-7}$, their absorption does not
saturate even in a fully neutral IGM. OI is a particularly attractive tracer:
its ionization potential differs from that of hydrogen by only ΔE = 0.02 eV, so
it should sit in tight charge exchange equilibrium with HI. It is nevertheless
imperfect, because metals do not fill all of space and tend to lie
preferentially in the ionized regions around galaxies. However, if there is
substantial early star formation (and hence metal pollution), and if metals lie
in overdense regions (which recombine quickly once nearby ionizing sources shut
off), the test may prove useful. Recent high-resolution observations found a
sharp rise in the abundance of OI absorbers at high redshifts, with abundances
comparable to the predicted number of Lyman limit systems [479]. However, there
are striking line-of-sight variations; four of the six observed systems lie
toward SDSS J1148+5251 (z = 6.42), even though nine lines of sight were
observed. It is highly unlikely that this is due to variations in sensitivity or
random fluctuations. The interpretation of this puzzling result is still
unclear, but it likely signals inhomogeneous metal pollution and/or reionization.

Throughout this section, we have referred exclusively to quasar spectra, but 
of course any other luminous, high-redshift background source will do. GRBs 
are a particularly interesting alternative, because they likely occur at high 
redshifts (so long as they approximately trace the star formation history of 
the Universe [27]). Moreover, cosmic time dilation implies that, as the burst 
redshift increases, a fixed observer time corresponds to earlier and earlier times 
in the frame of the burst; because bursts fade rapidly, this helps counteract 
the usual decline of flux with luminosity distance and makes GRBs visible 
to higher redshifts than naively expected [28,29]. They have the theoretical 
advantage of simpler intrinsic spectra, without large proximity zones or Lyα
emission lines, that should in principle allow easier extraction of the shape
of the red damping wing - which in turn offers a sensitive constraint on the
neutral fraction [24,26]. However, most GRBs have strong damped-Lyα absorbers
from the host galaxies, making the damping wing test much more
difficult (though it can still be applied, because the line profiles differ in the 
two cases; see §1.1). The first (weak) constraints came from GRB 050904 at 
z = 6.3 [19,30], which was observed 3.4 days after the burst. Faster identifica- 
tion and followup of high-redshift GRBs will allow even more sensitive tests 
of the damping wing (especially because some GRBs do appear to have weak 
local absorbers; see the compilation in [31]), and of course measurements of 
the intervening IGM are independent of the local absorption. 
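
The time dilation argument is simple arithmetic: an observer-frame delay $t_{\rm obs}$ corresponds to a burst-frame time $t_{\rm obs}/(1+z)$. The short sketch below applies it to the GRB 050904 numbers quoted above; the function name is a hypothetical helper introduced only for illustration.

```python
def rest_frame_time(t_obs, z):
    """Burst-frame elapsed time for an observer-frame delay t_obs (same units)."""
    return t_obs / (1.0 + z)


# GRB 050904: z = 6.3, spectrum taken 3.4 days after the burst.
t_rest_days = rest_frame_time(3.4, 6.3)
print("burst-frame time:", t_rest_days, "days =", 24.0 * t_rest_days, "hours")
```

A spectrum taken 3.4 days after this burst therefore samples the afterglow only about half a day after the explosion in its own frame, when it is intrinsically much brighter than a low-redshift afterglow observed at the same delay.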

In summary, the picture painted by quasar absorption studies of the z ~ 6 IGM 
is highly complex and unclear: interpretations range from significantly neutral 
to highly ionized, and the topology of reionization remains quite controversial. 
Prospects for improvement are good, but they require more lines of sight to be 
identified and better modeling of the existing data. Ultimately, 21 cm studies 
have the power to strip away most of these uncertainties! 



11.3 CMB Polarization 



While 21 cm observations are sensitive to the neutral hydrogen density, CMB 
polarization experiments are sensitive to the free electron density, and there 
is obviously a good deal of potential for synergy between the two. 
CMB temperature anisotropies are relatively unaffected by reionization on
large scales, but on small scales they are suppressed by a factor $\exp(-2\tau_{\rm es})$,
where $\tau_{\rm es} = \sigma_T \int d\eta' \, n_e a$ is the electron scattering optical depth and $\eta$ is
the conformal time. The characteristic angular scale dividing these regimes
is $\ell_r = D_A[z(\eta_r)]/\eta_r$, where $\eta_r$ is the visibility-weighted conformal time
(essentially describing the distance to the reionization surface; see below for a
precise definition of the visibility function) [488]. Similarly, the angular power
spectrum of polarization is suppressed on small scales by the same factor
$\exp(-2\tau_{\rm es})$. However, on large scales, the rescattering of photons also creates
a broad peak in polarization power at $\ell \sim \ell_r$ [305,306]. This polarization
anisotropy is sourced by the temperature quadrupole seen by scatterers during
reionization, which is much larger than the temperature quadrupole at the
surface of last scattering. The amplitude of the bump depends most strongly
on $\tau_{\rm es}$ and on the amplitude of primordial potential perturbations. Other
parameters matter much less; for instance, the signal is nearly independent of
the baryon density, because the modes contributing to the reionization signal
have $k \sim 2/\eta_r$ and enter the horizon well after reionization.

Particularly if $\tau_{\rm es}$ is large, most of the information on reionization comes from
the auto-correlation power spectrum $C_\ell^{EE}$, rather than the cross-correlation
$C_\ell^{TE}$ between polarization and temperature.^57 This is primarily because
$C_\ell^{EE} \propto \tau_{\rm es}^2$, whereas $C_\ell^{TE} \propto \tau_{\rm es}$. Also, in the cosmic variance limit, the
fractional uncertainty in $C_\ell^{EE}$ is smaller than the fractional uncertainty in
$C_\ell^{TE}$ unless they are perfectly correlated. Furthermore, $C_\ell^{TE}$ correlates
quantities which at fixed k have different angular frequencies on the sky,
because the temperature anisotropies are projected from the more distant surface
of last scattering. The matching angular frequencies of $C_\ell^{EE}$ create
well-defined secondary peaks, whereas the mismatch in $C_\ell^{TE}$ washes out
fluctuation power, suppressing the peaks.

The first detection of reionization in CMB data was reported by the WMAP
team with their first year data; from $C_\ell^{TE}$, they found $\tau_{\rm es} = 0.17 \pm 0.08$ (2σ)
[43,44]. Two more years of data significantly improved the measurement, especially
because they were able to detect $C_\ell^{EE}$. The present best-fit value (using WMAP
data alone) is $\tau_{\rm es} = 0.088^{+0.028}_{-0.034}$ [20,45]. Because $\tau_{\rm es}$ suppresses the amplitude
of the CMB, other constraints on the power spectrum normalization (such
as large-scale structure) are also useful; the best fit measurement including
several other cosmological datasets is $\tau_{\rm es} \approx 0.069$, with an uncertainty of about 0.03.

Is any other information available beyond a simple measure of the integrated
optical depth $\tau_{\rm es}$? Unfortunately, the dominant modes in the large scale
polarization signal have $k \sim 2/\eta_r$. We expect little correlation between ionized
bubbles on horizon scales, and CMB polarization is essentially independent of
the topology of reionization. However, even if CMB polarization experiments
cannot probe patchy reionization, future high-precision measurements could
constrain the global reionization history $\bar{x}_i(z)$ and distinguish reionization
scenarios with identical total optical depths $\tau_{\rm es}$ [492,493].

57 Here "E" refers to curl-free or E-mode polarization [489-491]; reionization does
not generate the complementary divergence-free or B-mode polarization.

The leverage comes from the shape of the power spectrum; for models with
substantial partial reionization at high redshifts, there is significantly more
polarization power at smaller angular scales. Qualitatively, this occurs because
the horizon is smaller at high redshifts. Again, most of the information comes
from $C_\ell^{EE}$; we show some example power spectra, together with their
corresponding reionization histories, in Figure 43 (from [493]). Here the error
bars include cosmic variance only; thus, even with these radically different
reionization histories, the observable differences are relatively small and
constraints will be crude. CMB analyses typically characterize reionization as a
sharp step-function at some redshift $z_r$. Such an unrealistic reionization
history can significantly bias the estimate of $\tau_{\rm es}$ in a manner that depends
on the true $\bar{x}_i(z)$ [492,493]. This bias is smaller than the statistical errors
of WMAP but could easily be $\delta\tau_{\rm es} > 0.01$ for the Planck satellite, compared
with statistical errors $\delta\tau_{\rm es} \sim 0.005$ and the cosmic variance limit of
$\delta\tau_{\rm es} \sim 0.002$-$0.003$ [493]. It can be reduced by fitting slightly more
complicated reionization histories (such as two-stage reionization), though
uncertainty in this modeling may ultimately prevent optical depth measurements
from reaching the true cosmic variance limit. The accuracy with which
polarization can constrain the reionization history is likely similar, with
information on only ~ 2-3 broad redshift bins. The accuracy of 21 cm experiments
in tracking the volume filling fraction of ionized regions should be much higher
(see §8).
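
For orientation, the sketch below evaluates the optical depth of the sharp step-function history mentioned above, $\tau_{\rm es} = \sigma_T c \int_0^{z_r} n_e(z)\, dz / [(1+z) H(z)]$, both by Simpson's rule and with the closed form available for a flat matter plus Λ cosmology. The cosmological parameters, the treatment of helium as singly ionized at all $z < z_r$, and the function names are illustrative assumptions, not the analysis pipeline of any CMB experiment.

```python
import math

SIGMA_T = 6.6524e-25       # Thomson cross section, cm^2
C_LIGHT = 2.9979e10        # speed of light, cm s^-1
M_P = 1.6726e-24           # proton mass, g
RHO_CRIT_H2 = 1.8785e-29   # critical density divided by h^2, g cm^-3
MPC_CM = 3.0857e24         # one megaparsec in cm


def _prefactor(h, omega_b_h2, y_he):
    """sigma_T c n_e0 / H0, with helium assumed singly ionized."""
    h0 = h * 100.0 * 1.0e5 / MPC_CM
    n_h0 = (1.0 - y_he) * omega_b_h2 * RHO_CRIT_H2 / M_P
    n_e0 = n_h0 * (1.0 + y_he / (4.0 * (1.0 - y_he)))
    return SIGMA_T * C_LIGHT * n_e0 / h0


def tau_es_step(z_r, h=0.7, omega_m=0.3, omega_b_h2=0.022, y_he=0.24,
                n_steps=2000):
    """Optical depth for instantaneous reionization at z_r (Simpson's rule).

    n_steps must be even.
    """
    def integrand(z):
        return (1.0 + z) ** 2 / math.sqrt(omega_m * (1.0 + z) ** 3
                                          + 1.0 - omega_m)

    dz = z_r / n_steps
    total = integrand(0.0) + integrand(z_r)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * dz)
    return _prefactor(h, omega_b_h2, y_he) * total * dz / 3.0


def tau_es_step_analytic(z_r, h=0.7, omega_m=0.3, omega_b_h2=0.022,
                         y_he=0.24):
    """Closed form of the same integral for a flat matter + Lambda universe."""
    e_of_z = math.sqrt(omega_m * (1.0 + z_r) ** 3 + 1.0 - omega_m)
    return _prefactor(h, omega_b_h2, y_he) * 2.0 / (3.0 * omega_m) * (e_of_z - 1.0)


for z_r in (6.0, 11.0, 17.0):
    tau = tau_es_step(z_r)
    print(z_r, tau, math.exp(-2.0 * tau))  # last column: small-scale damping
```

With these assumed parameters $z_r = 11$ gives $\tau_{\rm es} \approx 0.086$, so optical depths near 0.09 correspond to step reionization at $z_r \sim 11$; any extended history with the same $\tau_{\rm es}$ must place part of the ionization at higher redshift.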

In practice, foregrounds are likely to be the key difficulty, and cleaning strate- 
gies are still under development (see [45] for an example). If foregrounds do 
impose a limit well above cosmic variance, the advantages of all-sky surveys 
are less apparent, and ground-based polarization experiments could be com- 
petitive with space-based missions [494]. We refer the interested reader to [3] 
for a more in-depth discussion of the future prospects of CMB polarization 
measurements. 



11.4 Small-Scale CMB Anisotropies

The same scattering processes that produce large-scale polarization also im- 
print temperature anisotropies on the CMB through the peculiar velocity of 
the scattering medium, which imparts a blue- or redshift to the scattered pho- 
tons. This "Doppler" or "kinetic Sunyaev-Zeldovich" (kSZ) effect has several 
observable consequences. 
Fig. 43. Large-scale CMB polarization power spectra (right panel) for five different
reionization histories (left panel). All are normalized to have identical power at
$\ell = 50$. The thin dashed curves in the right panel show the best fit instantaneous
reionization histories to models 4 and 5. The error bars show the cosmic variance
limit. From [493].

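The Doppler anisotropy toward direction $\hat{n}$ on the sky is

    \Delta T(\hat{n}) = -T_{\rm cmb} \int d\eta \, g_v \, \hat{n} \cdot \mathbf{v}(\hat{n}, \eta),    (170)

where $\mathbf{v}$ is the local peculiar velocity. We define
$g_v(z) = \dot{\tau}_{\rm es} e^{-\tau_{\rm es}} = \sigma_T n_e a \, e^{-\tau_{\rm es}}$
as the visibility function; it is the probability that a photon last scattered at
$\eta \pm d\eta$. (Throughout this section, an overdot will represent $d/d\eta$.)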

From equation (170), one might hope that linear order velocity fluctuations
would source temperature perturbations. Unfortunately, the kSZ effect vanishes
to linear order, because the crests and troughs of a perturbation produce
equal and opposite Doppler shifts [495]. Thus anisotropy only appears if either
$\mathbf{v}$ or $g_v$ evolves across the perturbation, and it is a second order effect.
Obviously, $g_v(z)$ can only evolve significantly across a perturbation that spans
relatively large scales. For reasonable reionization histories, the signal has a
broad peak at $\ell < 100$; on larger scales, the previously described cancellation
damps out the anisotropies. However, even at $\ell \sim 100$, the Doppler contribution
is more than an order of magnitude smaller than the primary CMB anisotropies [283]:
we need some way to isolate the contribution from reionization.
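
The cancellation between crests and troughs can be illustrated with a toy calculation: integrate a single line-of-sight velocity mode, $v \propto \cos(k\eta)$, against a smooth visibility function. Here the visibility function is taken to be a Gaussian of width Δη, which is purely an assumption for illustration (realistic histories give a different shape), and the function name is hypothetical. For a Gaussian the integral is $\cos(k\eta_0) \exp(-k^2 \Delta\eta^2/2)$, so modes with many oscillations across the visibility function contribute almost nothing.

```python
import math


def los_doppler_average(k, eta0=0.0, width=1.0, n_points=4001, span=8.0):
    """Integrate a unit-normalized Gaussian visibility times cos(k * eta).

    Uses the trapezoid rule over eta0 +/- span * width.
    """
    lower = eta0 - span * width
    step = 2.0 * span * width / (n_points - 1)
    norm = 1.0 / (width * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(n_points):
        eta = lower + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        visibility = norm * math.exp(-0.5 * ((eta - eta0) / width) ** 2)
        total += weight * visibility * math.cos(k * eta)
    return total * step


for k in (0.1, 1.0, 3.0, 5.0):
    numeric = los_doppler_average(k)
    analytic = math.exp(-0.5 * k * k)  # eta0 = 0 and width = 1
    print(k, numeric, analytic)
```

Only modes with $k \Delta\eta \lesssim 1$ survive, which is the sense in which the visibility function must evolve appreciably across the perturbation for a net signal to remain.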

The 21 cm background provides exactly such a tool [287], because it traces 
both large scale overdensities (which source velocity perturbations) and fluc- 
tuations in the ionized fraction. For example, consider an overdense region 
during (uniform) reionization. On the far side of the perturbation, the
Universe will be mostly neutral and hence little scattering will occur; the near
side, on the other hand, will be mostly ionized, imparting a net redshift to 
the CMB. Thus, in the simplest model (in which overdense regions are not 
ionized significantly earlier than underdense regions), the 21 cm signal and 
the CMB will be anti-correlated. On large scales, the cross-correlation can be 
written [287] 



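    \frac{\ell^2 C_\ell}{2\pi} = \frac{T_{\rm cmb} \, \overline{\delta T}_b \, D(z)}{2\pi} \left[ \ldots \; \bar{x}_i(z) \, P_{x\delta_L} \right] \frac{d}{d\eta} \left[ D g_v \right].    (171)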



As in equation (133), $\delta_L$ denotes the linear density field and marks those
terms arising from the velocity fluctuation in equation (170). The first term
in square brackets comes from the correlation of this velocity with the large
scale density fluctuations in the 21 cm signal; the $P_{x\delta_L}$ term arises from the
evolution of the ionized fraction across the density fluctuation. Note that these
terms have opposite signs: ionizing overdense regions first cancels part of the
cross-correlation by moderating the evolution of the electron fraction across
the perturbation. In realistic models, this term can easily dominate [287], so
the sign of the correlation provides information on reionization. It obviously
peaks at $\bar{x}_i \sim 0.5$ and therefore constrains $\bar{x}_i(z)$ directly as well. Simple models
predict peak signal strengths ~ 200-500 μK^2 [287], within reach of SKA-class
observatories (although foregrounds could pose a problem on these large
scales).

This linear Doppler effect peaks on large scales (hundreds of Mpc) and hence
does not depend strongly on the bubble properties (except insofar as their
bias affects $P_{x\delta_L}$). It also ignores the effect of small-scale inhomogeneities in
either the density or ionized fraction on $g_v(z)$. By including these in equation
(170), we see that the anisotropies depend on $q = v(1 + \delta + \delta_x)$. The angular
power spectrum then has significant contributions from $P_{vv}P_{\delta\delta}$, $P_{vv}P_{xx}$, and
$P_{vv}P_{\delta x}$ (as well as four-point terms) [368,496]. The first of these (the
Ostriker-Vishniac or OV effect) depends only on the density field and $\bar{x}_i(z)$ [497]. At the
relevant scales the OV effect is dominated by nonlinear structure formation
at low redshifts [498,499] and only depends weakly on reionization [356,368]
(though see [500]).

More interesting from our point of view are the so-called "patchy
reionization" terms $P_{vv}P_{xx}$ and $P_{vv}P_{\delta x}$ [366,367,501]. Physically these are sourced by
HII regions subtending only part of a large-scale velocity fluctuation: the
non-uniform ionized fraction prevents the usual linear-order cancellation. The
scale dependence of the patchy signal will therefore depend on the bubble size
distribution, and its amplitude on the duration over which patchiness persists.
The resulting signals have been calculated analytically [342,356,366-368,501] and
via simulations [364,502-504] for a number of different reionization scenarios.
Figure 44 shows some illustrative examples for a range of reionization
histories. We plot the patchy reionization component (the lower set of curves) along
with the "total" signals (including the patchy, OV, and primary anisotropies,
but neglecting the thermal Sunyaev-Zeldovich component). This calculation
is based on the bubble model presented in §8.2 (see also [341]); the peak at
$\ell \sim 3000$ corresponds to the characteristic bubble size at $\bar{x}_i \sim 0.5$ in these
models. Comparison with the $\bar{x}_i(z)$ curves in the right panel clearly shows
that the overall amplitude depends strongly on the duration of reionization,
because persistent patchiness helps to build up the signal. Simulations yield
similar amplitudes [364,504].

Fig. 44. Kinetic Sunyaev-Zeldovich anisotropies (left panel) from several different
reionization histories (right panel; here Q is the mean ionized fraction). In the left
panel, the lower set of curves shows the "patchy reionization" component while the
upper set includes the patchy, OV, and primary CMB anisotropies. We also show
the anticipated 1σ error bars for the Atacama Cosmology Telescope [505], assuming
perfect foreground removal. From [356].

Small-scale CMB temperature anisotropies therefore provide an integrated 
measurement of the duration of reionization as well as some limited informa- 
tion on the bubble size distribution. (Note as well that patchy reionization 
will bias cosmological measurements that use the tail of the primary CMB 
anisotropies [364,368].) The expected amplitudes are well within reach of the 
next generation of ground-based CMB experiments (see the sample error bars 
in Fig. 44). The main difficulty, as usual, is foreground contamination. At 
most frequencies, the thermal Sunyaev-Zeldovich effect dominates by a large 
factor; fortunately, it has a null at ν = 218 GHz and can be cleaned efficiently
through multifrequency measurements. More worrying is point source contam- 
ination, which could present a substantial problem [506]. Finally, Figure 44 
also shows that the patchy contribution will be smaller than the OV effect from 
low redshifts, which must be modeled properly to isolate the contribution from 
reionization. 
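
The location of the thermal SZ null can be checked directly. In the non-relativistic limit the thermal SZ temperature decrement is proportional to $x \coth(x/2) - 4$ with $x = h\nu/k_B T_{\rm cmb}$, and the sketch below finds the zero of that function by bisection. The CMB temperature of 2.725 K and the function names are assumptions made for illustration, and relativistic corrections (which shift the null slightly for hot gas) are ignored.

```python
import math

H_OVER_K = 0.0479924   # Planck constant over Boltzmann constant, K per GHz


def tsz_spectral_function(x):
    """Frequency dependence of the thermal SZ temperature shift, x coth(x/2) - 4."""
    return x * (math.exp(x) + 1.0) / (math.exp(x) - 1.0) - 4.0


def tsz_null_ghz(t_cmb=2.725):
    """Frequency (GHz) at which the thermal SZ signal vanishes."""
    low, high = 1.0, 10.0   # the function is negative at low, positive at high
    for _ in range(100):
        mid = 0.5 * (low + high)
        if tsz_spectral_function(mid) < 0.0:
            low = mid
        else:
            high = mid
    x_null = 0.5 * (low + high)
    return x_null * t_cmb / H_OVER_K


print("thermal SZ null:", tsz_null_ghz(), "GHz")
```

The null sits near $x \approx 3.83$, so a channel close to this frequency sees the kinetic SZ and primary anisotropies with essentially no thermal SZ contamination, which is the basis of the multifrequency cleaning described above.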

Intuitively, one would expect these patchy anisotropies to be anti-correlated 
with 21 cm fluctuations, because both are produced by ionized bubbles during 
reionization. However, there has been some controversy over this naive
expectation. The cross-correlation has terms proportional to $P_{v\delta}P_{xx}$ and $P_{vx}P_{\delta x}$. In
analytic models, there is a scale mismatch between the arcminute 21 cm
fluctuations and the velocity fluctuations that source the CMB anisotropies [283].
The CMB signal must be integrated along the line of sight, and the highly 
oscillatory mode-coupling integral cancels out most of the signal. One can 
avoid the cancellation by constructing a bispectrum (the Fourier-space equiv- 
alent of the three-point function) between the 21 cm signal, the small-scale 
kSZ anisotropy, and the large-scale Doppler anisotropy, because the last of 
these directly samples the velocity field [283]. Unfortunately, even with ideal 
experiments extracting the signal will be a challenge. 

However, direct calculations with numerical simulations show that the
cross-correlation between 21 cm fluctuations and small-scale CMB anisotropies does
not vanish [504]; instead they are anti-correlated on scales smaller than the
characteristic bubble size at the appropriate redshift. The evolution of the
zero point therefore constrains the growth of HII regions during reionization
and can help extract information from the 21 cm signal. The reason for the
discrepancy with the analytic predictions is not clear, but it most likely lies
in the assumption $P_{xx} \propto P_{\delta\delta}$ made in the analytic model. As we have seen
(§8.2), the bubble power spectrum actually has features on much larger scales
than the density field and is imperfectly correlated on smaller scales. Thus the
apparent cancellation may be a result of the simplified analytic model used
by [283].

Another CMB signature of early structure formation could take the form of 
the thermal rather than the kinetic SZ effect: energy from high-redshift super- 
novae must have been deposited in the CMB via Compton cooling, and the 
strong clustering bias of high-redshift halos means that strong anisotropies 
can develop without violating constraints on the Compton y-distortion pa- 
rameter [507]. While uncertainties are large, a thermal SZ contribution from 
high-redshift supernovae could in principle explain the excess anisotropy at 
arcminute scales seen in the CBI [508], BIMA [509,510], and ACBAR [511] 
experiments. If they are due to thermal SZ from galaxy clusters, the observed
anisotropies require $\sigma_8 \sim 1$ [512]; since the cluster SZ signal is $C_\ell \propto \sigma_8^7$ [513],
the tension increases still further given the lower value $\sigma_8 = 0.74^{+0.05}_{-0.06}$ found
in the three-year WMAP data. Because star-forming halos source HII regions,
an anti-correlation between the small-scale SZ and 21 cm signals may be
detectable [514]. The true source of this small-scale CMB anisotropy will become
clearer with more sensitive experiments such as the Atacama Cosmology Tele- 
scope, which will be able to resolve out the cluster contribution (but not the 
high-redshift contribution). 

Finally, inhomogeneous scattering during reionization also produces small- 
scale anisotropies in the CMB polarization [498]. These are, unfortunately, 
much smaller than the primary polarization anisotropies at the relevant scales,
so isolating them requires a tracer field such as the 21 cm signal [284]. The 
simple cross-correlation vanishes because the polarization depends on the 
background CMB quadrupole (which is uncorrelated with the 21 cm
fluctuations), so the signal can only be extracted through the bispectrum of the
large-scale CMB temperature anisotropies, the small-scale CMB polarization 
anisotropies, and the 21 cm (or some other tracer) field [284]. It may be de- 
tectable with the combination of high signal-to-noise CMB polarization maps 
and SKA-class 21 cm telescopes. 



12 Concluding Remarks 

We hope to have convinced the reader of the unparalleled promise of 21 cm 
observations for unlocking the mysteries of the high-redshift Universe - as 
well as the pitfalls that lie ahead before observations can be successful. These 
challenges - including (but most likely not limited to) terrestrial interference, 
ionospheric distortions, foreground contamination (especially by the polarized 
component), and beam-shape control - are formidable indeed and should not 
be underestimated. But, already in the past three years, enormous strides 
have been made both on the instrumental and data analysis sides, and we 
have every hope that each of these difficulties can be overcome. In many 
ways, the 21 cm community today is analogous to the CMB community in 
the 1980s: a clear observational goal exists, but experiments are just starting 
to explore the landscape. We do not know, of course, how the potholes along 
this road compare to those along the path toward the CMB, but the scientific 
return is clearly large enough that sustained exploration is worthwhile. On the 
other hand, the final destination (~ 20 mK fluctuations from HII regions and 
~ 3 mK fluctuations from the underlying density field) is better known (or so 
we hope), because CMB and other observations have lifted the fog concealing 
the basic cosmological and structure formation parameters. 

The richness of the physics available through the 21 cm line clearly makes 
even a long journey down this road worthwhile. Because it is (by definition) 
the last epoch to produce a strong signal, and because the observational chal- 
lenges are smallest at high frequencies, the era most amenable to observations 
is reionization itself - which is, fortunately, also the most interesting epoch 
from an astrophysical perspective. What sources are responsible for it? How 
does feedback affect them? How do these galaxies interact with the IGM? In 
conjunction with the other methods described in §11, the first generation of 21 
cm experiments promises to open reionization for detailed study. But the 21 
cm transition is unique in that only it can directly probe the three-dimensional 
morphology and evolution of HII regions. Our most nagging concern about the 
potential for these experiments to study reionization - that it may have 
occurred at such high redshifts that it will lie beyond their reach - seems less 
pressing with the latest WMAP data [20,45]. 

Although foregrounds make higher redshifts more difficult to observe, those 
epochs also contain fascinating information - the 21 cm line could allow us 
to probe structure formation from its earliest, linear phases (z > 50) through 
the formation of the cosmic web and the first luminous sources. At the highest 
redshifts, the 21 cm line provides a rich testbed for cosmology, probing much 
smaller scales than the CMB and making three-dimensional measurements 
over a wide redshift range. Once the first stars and galaxies form, the 21 cm 
line becomes a sensitive measure of their influence on the IGM - providing 
perhaps our first view into the secrets of these objects (albeit an indirect one). 
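
As a rough guide to why the higher-redshift epochs are harder to reach, the sketch below converts a few of the redshifts mentioned in this review into observed frequencies of the redshifted 21 cm line, using the standard relation between observed frequency, rest frequency, and redshift. The function name and the rounded rest-frequency constant are illustrative choices.

```python
# Illustrative sketch: observed frequency of the redshifted 21 cm line.
# nu_obs = nu_rest / (1 + z), with nu_rest the hyperfine rest frequency.

NU_21CM_MHZ = 1420.4058  # rest frequency of the 21 cm line in MHz (rounded)


def observed_frequency_mhz(z):
    """Return the observed frequency in MHz of 21 cm emission from redshift z."""
    if z < 0:
        raise ValueError("redshift must be non-negative")
    return NU_21CM_MHZ / (1.0 + z)


if __name__ == "__main__":
    for z in (6, 50, 200):
        print(f"z = {z:3d}: observed frequency = {observed_frequency_mhz(z):7.2f} MHz")
```

The end of reionization (z ~ 6) falls near 200 MHz, while z ~ 50 and z ~ 200 map to roughly 28 MHz and 7 MHz, deep in the regime where the foregrounds are brightest.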

The first generation of 21 cm experiments, which are now under construction, 
will make the first strides down this road, and the next several years promise 
to be a truly exciting time in cosmology. 